
AMD’s 64-Core Threadripper 3990X, only $3990 Coming February 7th - erik
https://www.anandtech.com/show/15318/amds-64core-threadripper-3990x-3990-sd
======
ChuckMcM
When I left Sun in 1995, their "biggest" Iron was the Enterprise 10K (which
internally was called "Dragon" because of the Xerox bus). A system with 64
cores and 256GB of RAM was just under 2.5 million dollars list. It needed over
10kW of power provided by a 60A 240V circuit; the power cord weighed in at
like 20 lbs. I put together a new desktop with the TR3960 and 128GB of ECC
RAM, and that motherboard will take the 3990 and 256GB of RAM if I choose to
upgrade it. It really boggles my mind what you can fit under your desk these
days with a single 120V outlet.

~~~
jonas21
In 2001, the fastest supercomputer _in the world_ was ASCI White. It cost
$110M, weighed 106 tons, consumed 3MW of power (plus 3MW for cooling), and had
a peak speed of 12.3 TFLOPS.

Right now, sitting under my desk is an RTX 2080 Ti GPU, which cost around
$1,000, weighs 3 pounds, draws a maximum of 250 watts, and has a peak speed of
13.4 TFLOPS [1].

We truly live in amazing times.

 _[1] Not quite a fair comparison: the GPU is using 32-bit floating-point,
while ASCI White used 64-bit. But for many applications, the precision
difference doesn't matter._
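
Just for fun, the improvement ratios implied by those figures (a quick
sketch; assumes US short tons and ignores ASCI White's 3MW of cooling power):

    # Improvement ratios between ASCI White (2001) and an RTX 2080 Ti,
    # using the numbers quoted above.
    ratios = {
        "cost":   110_000_000 / 1_000,
        "weight": 106 * 2000 / 3,      # short tons -> lbs, vs. 3 lbs
        "power":  3_000_000 / 250,     # compute power only
    }
    for name, r in ratios.items():
        print(f"{name}: ~{r:,.0f}x")
    # cost: ~110,000x, weight: ~70,667x, power: ~12,000x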

~~~
sillysaurusx
It's not fast enough. After having access to 160 TPUs, it's physically painful
to use anything else.

I hope in 20 years I'll have the equivalent of 160 TPUs under my desk.
Hopefully sooner.

The reason it's not fast enough is that ... there's so much you can do! People
don't know. You can't really know until you have access to such a vast amount
of horsepower, and can apply it to whatever you want. You might think "What
could I possibly use it for?" but there are so many things.

The most important thing you can use it for is fun, and intellectual
gratification. You can train ML models just to see what they do. And as AI
Dungeon shows, sometimes you win the lottery.

I can't wait for the future. It's going to be so cool.

~~~
cr0sh
> You can't really know until you have access to such a vast amount of
> horsepower, and can apply it to whatever you want.

Something I've often wondered (and there are probably good reasons why not)
is why billionaire tech moguls - even the ones who are outwardly technical,
or were in the past, like Bill Gates, who we know had technical chops - have
never (that I'm aware of) tried to build "their ultimate computer".

For instance, if I had their kind of money, I've often thought that I would
construct a datacenter (or maybe multiple datacenters, networked together)
filled with NVidia GPU/TPU/whatever hardware (the best of the best they could
sell me) - purely for use as my "personal computer". Completely non-public,
non-commercial - just a datacenter I would own with racks filled to the brim
with the best computing tech I could stuff into them (on a side note, I've
also pondered the idea of such a personal datacenter, but filled with D-Wave
quantum computing machines or the like).

What could you do with such a system?

Obviously anything massive parallelism is useful for - the usual simulation,
machine learning, etc.; but could you make any breakthroughs with it,
assuming you had the knowledge to do such work?

Which is probably why none have done it - at least as a personal thing.

I mean, sure, I would bet that people who own large swathes of machines in a
datacenter, or those who outright own datacenters (like Google or Amazon) -
their founders and likely internal people do run massively parallel
experiments or whatnot on a regular basis, ad hoc, and "free" - but it's a
commercial thing, and other stuff is also running on those machines...

But a single person is probably unlikely to have, or think of, problems that
would require such a grand scale before they would just "start a company to
do it" or something similar; because in the end, just maintaining and
administering everything in such a datacenter, if one were built, would
require (I would think) the resources of a large company.

Of course, then I wonder whether such companies - especially ones like Google
and Amazon, which own and run many datacenters around the world and also sell
their compute resources - weren't started in some fashion (even if only in
the back of their founders' heads) with that idea or goal in mind: that is,
to be able to own and use on a whim "the world's largest amount of computing
power"...?

~~~
kjs3
Paul Allen kinda did just that, although in a different direction. He built a
datacenter and filled it with a bunch of old computers he thought were cool,
like the DEC PDP-10. It's now the Living Computer Museum in Seattle.

[https://www.pcworld.com/article/3313424/inside-seattle-living-computer-museum-pc-history.html](https://www.pcworld.com/article/3313424/inside-seattle-living-computer-museum-pc-history.html)

------
faitswulff
The comments are mentioning the Xeons in Mac Pros and how Apple should switch.
I have no factual basis for this, but I figure Apple has got to be using AMD's
new chips as leverage to get some pretty sweet deals on Intel silicon.

~~~
wmf
Deals that Apple does not pass on to their customers.

~~~
jlgaddis
If people continue to pay Apple's (sometimes) outrageous prices, why should
they lower them?

(I'm just as guilty, having spent over two grand on MBPs multiple times!)

~~~
bcrosby95
Despite their price, there was a time when MacBooks were only 5-10% more
expensive than the equivalent PC laptop. People complaining about the price
were inevitably comparing it to bottom-of-the-barrel PC laptops, not
higher-end business laptops with comparable specs.

I haven't priced out any recent MacBooks to know if that's still true,
though. Glancing at the new 16" MacBook Pro, it seems like it might be
reasonably priced for what you're getting.

~~~
x3sphere
Yeah, the MBP 16” is pretty comparable to the Dell XPS in price - at least
when comparing base models.

However, the costs go up a lot if you spec out a custom config (+$400 just for
32GB RAM). Then the MBP starts looking quite a bit more expensive. Overall I
don’t think they’re a bad buy though if you want macOS.

~~~
ksec
>Yeah, the MBP 16” is pretty comparable to the Dell XPS in price - at least
when comparing base models.

The Dell XPS [1] with a comparable spec costs $1,650, compared to $2,399 for
the MBP 16". In the old days Apple would have priced it closer to $2,199 or
slightly lower.

Somewhere along the line they started giving the Mac the same margins as the
iPhone.

[1] [https://www.dell.com/en-us/shop/deals/new-xps-15-laptop/spd/xps-15-7590-laptop/xnber5cr656ps?view=configurations](https://www.dell.com/en-us/shop/deals/new-xps-15-laptop/spd/xps-15-7590-laptop/xnber5cr656ps?view=configurations)

------
OkGoDoIt
“consumer variant of the 64-core EPYC”

At nearly $4000 for just the CPU, is that still consumer territory? I assume
only huge companies would spend that much money on a single CPU.

~~~
hutzlibu
There is a huge gamer market. Every hardcore gamer wants to have the fastest
CPU available.

Those who can afford it will buy it.

edit: yes, it is definitely overkill for most if not all available games, but
in a certain scene "overkill" is considered awesome

~~~
k_sze
Does the CPU actually matter that much for modern AAA games? (I haven't played
any AAA game in a looooooooooong time.)

~~~
colejohnson66
LinusTechTips tested it, and, IIRC, the answer is CPU bottlenecks aren’t
really a thing anymore.

~~~
rasz
They sure are a thing, because programmers. For example, for stutter-free
gameplay, RDR2 needed 6 CPU cores until a patch one month ago, thanks to bad
console-first optimizations.

~~~
colejohnson66
The key word was _really_. There are obviously times where CPU bottlenecks are
a thing, but for the most part, they’re not.

------
sytelus
So wouldn't this be like 2.6 TFLOPS? I'm wondering if this could replace
NVidia V100s to train something like ImageNet purely on CPU. However, the
V100 has 100 TFLOPS, which seems 50x more than the 3990X. Perhaps I'm reading
the specs wrong?

PS: Although FLOPS is not a good way to measure this stuff, it's a good
indication of a possible upper bound for deep-learning-related computation.
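
For reference, the usual back-of-the-envelope is cores x clock x FLOPs per
cycle; a rough sketch (the ~2.9 GHz all-core clock is an assumption, not a
measurement):

    # Peak FLOPS estimate: cores x clock x FLOPs/cycle. Zen 2 has two
    # 256-bit FMA units per core: 8 FP32 lanes x 2 ops per FMA x 2 units
    # = 32 FLOPs/cycle (half that for FP64).
    cores, ghz = 64, 2.9
    fp32 = cores * ghz * 32 / 1000   # ~5.9 TFLOPS FP32
    fp64 = cores * ghz * 16 / 1000   # ~3.0 TFLOPS FP64
    print(f"3990X peak: ~{fp32:.1f} TFLOPS FP32, ~{fp64:.1f} TFLOPS FP64")
    # The V100's 100+ TFLOPS headline figure is for FP16 tensor cores;
    # its FP32 peak is ~15.7 TFLOPS, so the FP32 gap is closer to ~3x.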

~~~
choppaface
DDR4 has a bandwidth of about 25 gigabytes per second. The memory on a V100
does about 900 gigabytes per second. Cerebras claims 9.6 petabytes per second
of memory bandwidth. For stochastic gradient descent, which typically
requires high-frequency reads and writes, memory bandwidth is crucial. For
ImageNet, you're trying to run well over 1TB of pixels through the processing
device as quickly as possible while the processor uses a few gigabytes of
scratch space.
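
To put rough numbers on that, here is the naive time to stream a ~1 TB epoch
at each bandwidth (pure arithmetic; it ignores caching, decode, and compute):

    # Naive streaming time for ~1 TB of decoded pixels per epoch.
    epoch_gb = 1000
    for device, gb_per_s in [("DDR4, one channel", 25), ("V100 HBM2", 900)]:
        print(f"{device}: {epoch_gb / gb_per_s:.0f} s per pass")
    # DDR4, one channel: 40 s per pass
    # V100 HBM2: 1 s per pass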

~~~
shaklee3
DDR4 has a bandwidth of about 25GBps _per channel_. You can hit around
100-200GBps on Epyc processors if you're utilizing RAM efficiently. GPUs tend
to enforce programming models that ensure more sequential accesses, but CPUs
can do it too.
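
The per-channel figure is just transfer rate times bus width; a quick sketch
(assuming DDR4-3200 and Epyc's 8 channels; the 3990X itself has only 4):

    # Channel bandwidth = transfer rate x bus width (64 bits = 8 bytes).
    per_channel = 3200e6 * 8 / 1e9               # 25.6 GB/s
    print(f"{per_channel:.1f} GB/s per channel")
    print(f"{8 * per_channel:.0f} GB/s across 8 channels")   # ~205 GB/s
    print(f"{4 * per_channel:.0f} GB/s across 4 (3990X)")    # ~102 GB/s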

~~~
choppaface
oh that's true thanks! I knew GDDR had higher bandwidth but the gap seemed a
little high when I looked it up

------
robviren
Some product manager smiled as they entered the pricing for this product.
Screw the margin, make it match the SKU.

~~~
IanCutress
True story: AMD's original price was different. I suggested this price in a
pre-briefing the night before. I got an email at 4am on the day of the
announcement to say it had been changed.

------
SXX
Can someone clarify what exactly is cut here compared to the EPYC 7742, other
than PCIe lanes? I don't really get how AMD wants to avoid competing with
their own product.

~~~
wmf
They also cut multi-socket support and half the memory channels. It's only
~10% cheaper than the Epyc 7702P, so AMD is not hurting themselves.

~~~
SXX
Ah okay. I missed the halved memory channels, so it does make sense now.

------
ComputerGuru
I haven’t seen any reviews taking into account the price of unregistered DDR4
RAM at the densities required to use these things. AMD is suggesting 2GB/core,
or 128GB, but these HEDT CPUs cannot use the typical registered RAM found in
server memory kits and require unregistered desktop RAM (for market
segmentation, you see), which can be (depending on SKU) quite a bit more
expensive. I feel any price comparison with Epyc or Xeon needs to take that
into account.

------
geokon
This is a bit tangential, but maybe someone could give me some advice.

When running with many cores, is there some way to get the Linux kernel to
run them at a fixed frequency?

I've been doing multicore work lately on AWS - which works pretty well, but
you don't get access to event counters, so sometimes I have trouble zeroing
in on what the performance bottleneck is. At higher core counts I get weird
results and I can never tell what's really going on. Running locally, I have
concerns about random benchmark noise like thermal throttling, "turbo", and
other OS/hardware surprises (I know on modern chips single-threaded stuff can
run really differently from when you load all the cores). I've been thinking
of getting something a bit dumber like an old 16-core Xeon (I'm on a bit of a
budget) and clocking it down - or maybe there is some better solution?
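
(For reference, one common approach, sketched below under the assumption
that the machine uses the acpi-cpufreq driver with the "userspace" governor
available; on intel_pstate you would clamp scaling_min_freq/scaling_max_freq
instead. From the shell, cpupower frequency-set -g userspace -f 2.0GHz does
the same thing.)

    # Minimal sketch: pin every core to a fixed frequency via sysfs.
    # Needs root. On acpi-cpufreq systems, boost/turbo can also be
    # disabled by writing 0 to /sys/devices/system/cpu/cpufreq/boost.
    import glob

    TARGET_KHZ = "2000000"  # 2.0 GHz -- hypothetical target frequency

    for policy in glob.glob("/sys/devices/system/cpu/cpufreq/policy*"):
        with open(f"{policy}/scaling_governor", "w") as f:
            f.write("userspace")
        with open(f"{policy}/scaling_setspeed", "w") as f:
            f.write(TARGET_KHZ)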

~~~
alvern
I don't have an answer to your frequency problem, but I ran into similar
issues with memory usage when trying to process LIDAR data in pandas and
Keras. I ended up buying an old HP quad-Xeon 4U with 320GB of DDR3 for $330
shipped. I used [https://labgopher.com/](https://labgopher.com/) to find the
server deals.

~~~
tracker1
Thanks for sharing labgopher... it would have made my own search much easier
last year.

------
gameswithgo
I think the smart buy (for most) is still the 32-core Threadripper. It is a
lot less money, with much higher clocks, and is less likely to be
RAM-throughput starved.

------
ilaksh
I've been experimenting with some ideas for new ML approaches (not neural
networks). I was thinking about playing around with FPGAs, but the high-end
ones are really expensive. 64 cores is making me think it probably is not
worthwhile to focus on FPGAs, or even necessarily GPU programming, like I was
thinking before.

~~~
dragontamer
With 64 lanes of PCIe on this Threadripper... why not all three?

There's plenty of PCIe space to afford a few GPUs to experiment with
alongside your 64-core CPU. You can probably shove 2x GPUs + 1x FPGA into one
box.
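
A rough lane budget (taking the 64-lane figure above; the slot assignment is
hypothetical):

    # PCIe lane budget for the mixed rig.
    devices = {"GPU #1": 16, "GPU #2": 16, "FPGA": 16, "2x NVMe": 8}
    print(f"{sum(devices.values())} of 64 lanes used")  # 56 of 64 lanes used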

~~~
ilaksh
Right, and it is possible that by doing something like that, the power of the
system would be multiplied many times over. It's really more about the
programming model.

~~~
dragontamer
I'm not sure if that's a big deal with a single box.

A Threadripper is only 300W+ under load; idle is ~100W for the whole system.
([https://www.kitguru.net/components/cpu/luke-hill/amd-ryzen-threadripper-3960x-3970x-cpu-review/11/](https://www.kitguru.net/components/cpu/luke-hill/amd-ryzen-threadripper-3960x-3970x-cpu-review/11/))

GPUs are similar: they take ~20W while idling, but 300W to 500W under load.
FPGAs are similar as well.

-------

The total system idle of a Threadripper + 2x GPU + FPGA rig probably is under
200W.

If you happen to utilize the entire machine, sure, you'll be over 1000W. But
you'll probably only utilize parts of the machine as you experiment and try to
figure out the optimal solution.
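
A quick power-budget sketch using the rough per-device figures above (these
are ballpark numbers, not measurements):

    # Idle vs. worst-case power for the hypothetical mixed rig.
    watts = {                      # (idle, full load)
        "Threadripper system": (100, 350),
        "2x GPU":              (40, 700),
        "FPGA":                (20, 200),
    }
    idle = sum(i for i, _ in watts.values())
    peak = sum(l for _, l in watts.values())
    print(f"idle ~{idle} W, worst case ~{peak} W")  # ~160 W / ~1250 W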

A "mixed rig" that can handle a variety of programming techniques is probably
what's needed in the research / development phase of algorithms. Once you've
done enough research, you build dedicated boxes that optimize the ultimate
solution.

~~~
ilaksh
By power I meant compute power. Did not mean power draw.

~~~
dragontamer
Ahh, gotcha. I guess I misunderstood.

Your earlier comment makes sense now that I understand what you're saying.

------
tedunangst
16 cores per memory channel seems to be pushing it, if the 47% Cinebench
improvement is any indication.
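
For what that implies (assuming the 47% figure is measured against the
32-core 3970X, which shares the same four memory channels):

    # Doubling cores for +47% throughput means ~73.5% parallel efficiency,
    # consistent with the extra cores fighting over memory bandwidth.
    speedup, core_ratio = 1.47, 64 / 32
    print(f"parallel efficiency: {speedup / core_ratio:.1%}")  # 73.5%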

------
sitzkrieg
I want one of these solely to shave days off of waiting for FPGA synthesis.
Even on a 3900X with a smaller hobby board, things can take hours.

Now, whether it's worth selling a body organ to kit one out...

------
kchoudhu
How does Windows 10 licensing work for this number of cores?

~~~
ChuckNorris89
Honest question: do people who need 64 cores use Windows? I assumed these
types of workstations and workloads are 99% Linux-based.

IIRC, Windows's scheduler wasn't as good as Linux's at managing these kinds
of parallel workloads.

~~~
davidrm
Yes, we do :) Well, maybe not 64 cores.

In embedded/automotive, the majority of the tooling does not have a Linux
version. Compiling is a bitch. Still, you're probably right about the 99%
comment.

------
rafaelvasco
Having a blast with the Ryzen 5 3600X. The only problem is that Win10 appears
to have a bug with it: stutters all over. It only stopped when I reinstalled
the newest chipset drivers and set the Ryzen Balanced power profile. Windows'
default power profiles all stutter, the fans make too much noise, etc. Now
the clock varies from ~3.8 to ~4.2 GHz; before, it was fixed at 3.79.

~~~
swebs
Have you tried it in Linux? When the first Threadrippers came out, I remember
seeing people having all sorts of problems with them in Windows 10, with the
same machines running in Linux outputting double the performance.

~~~
rafaelvasco
Yeah, probably a Win10 driver issue. When I updated it, that fixed it. Before
that, I even got a BSOD because of it.

------
naveen99
When can I buy the new AMD Vega 2 without having to purchase a Mac Pro?

------
PHGamer
It's a bit excessive for the prosumer market, but these will make great cheap
home lab machines.

