
AMD EPYC processors come to Google, and to Google Cloud - jhealy
https://cloud.google.com/blog/products/compute/amd-epyc-processors-come-to-google-and-to-google-cloud
======
joosters
Wow, first photo is a 2000x1489px PNG (4.5MB) and it is scaled to just
451x335px. (not even clickable to view at full size, it's just for page
decoration)

Nice web design, google!

~~~
andreasley
There's also an animated GIF (Turn_it_to_11.gif) that weighs 6.7MB and is not
even visible until you click "Show related articles" at the bottom. And even
if you do that, it's only displayed in 164x82px instead of the original
2880x1200px.

Looks like Google's CDN works so well that creating resized versions to save
bandwidth is not worth it. :)

~~~
adossi
Sounds like the automatic compression functionality on their CDN isn't
exactly working the way it's supposed to. I know AWS and Cloudflare have
something like this that I use in conjunction with saving website images on S3.

------
bhouston
With 200 CPU threads (128 CPU cores), I think most startups can run a complex
microservice environment on a single machine.

Kubernetes requires three nodes to start, but this still greatly reduces the
need for a lot of separate machines.

256 threads per machine this year. 512 threads per machine 2 years from now?
And then hopefully 1024 threads per machine 4-5 years from now? That would be
really fun.

(I will take a laptop in 4 years with just a lowly 64 cores please, leaving
the heavy iron for the cloud machines.)

I want Moore's law back but in parallel form. The years since 2004 with x86
machines have been quite boring from a CPU performance increase perspective.

~~~
kevinconaway
> With 200 CPU threads (128 CPU cores), I think most startups can run a
> complex microservice environment on a single machine.

Why would you want to? Your whole environment will be down when the machine,
or some component thereof, fails or is rotated out for maintenance.

I think this is more of a benefit for cloud providers in that they can pack
more disparate customer workloads on to a single machine.

~~~
mike_hearn
So have two of them with synchronous replication and fail over between them.
You have to think about dying or suddenly slow machines even in a
microservices architecture.

The advantage of such a machine is that if it starts to die, you fail over
everything atomically and then repair/replace the backup. You don't need to
think about what happens if one component is on a dead machine and the others
aren't (does your load balancer handle that well, given machines often "die"
in ways that just make them slow rather than totally failed?)
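
The atomic-failover idea above can be sketched in a few lines. This is a minimal, hypothetical illustration (the `Node` and `Watchdog` names are made up, not from any real library): the watchdog treats a merely slow machine the same as a dead one, and flips the entire environment to the backup in a single decision rather than per-service health checks.

```python
# Hypothetical sketch: whole-environment failover between two machines.
# A slow ("lagging") node triggers the same atomic switch as a dead one.

class Node:
    def __init__(self, name):
        self.name = name
        self.healthy = True
        self.lagging = False  # machines often "die" by just getting slow

class Watchdog:
    def __init__(self, primary, backup):
        self.primary, self.backup = primary, backup
        self.active = primary

    def check(self):
        node = self.active
        # One decision covers everything running on the box: no
        # partial failures where some services are on a dead machine.
        if not node.healthy or node.lagging:
            self.active = self.backup if node is self.primary else self.primary
        return self.active

if __name__ == "__main__":
    primary, backup = Node("primary"), Node("backup")
    wd = Watchdog(primary, backup)
    primary.lagging = True
    print(wd.check().name)  # failover happens on slowness, not just death
```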

The big win though is if you get rid of the microservices and run the whole
thing in one big process. No complex RPC failures, obscure HTTP/REST attacks
like [https://portswigger.net/blog/http-desync-attacks-request-smuggling-reborn](https://portswigger.net/blog/http-desync-attacks-request-smuggling-reborn)
and so on.

That might sound mad, but modern JVMs can run lots of languages fast and have
ultra-low-pause GCs (less than one millisecond) that can use terabytes of
heap. Many, many businesses fit into these really high-end machines with a
giant JVM.

I've written about the possibility of a swing back to big iron design here:

[https://blog.plan99.net/vertical-architecture-734495f129c4](https://blog.plan99.net/vertical-architecture-734495f129c4)

And a look at modern Java GCs - GC being historically the bottleneck to really
large single processes:

[https://blog.plan99.net/modern-garbage-collection-part-2-1c88847abcfd](https://blog.plan99.net/modern-garbage-collection-part-2-1c88847abcfd)

------
KuiN
Not a hugely surprising development given what was discussed yesterday:

[https://news.ycombinator.com/item?id=20640148](https://news.ycombinator.com/item?id=20640148)

~~~
pankajdoharey
If they bring a 64-core part to the desktop, Intel will have no market for
years.

~~~
krylon
This might sound like a silly question, but what would one use a 64-core CPU
on a desktop machine for? Or more precisely, in what situations is a 32-core
Threadripper 2 insufficient?

~~~
r00fus
I don’t think even enthusiasts can fully engage a 64-core device in any
meaningful use where a 32-core wouldn’t suffice.

This is a server chip that excels at virtualization.

~~~
mmrezaie
Then you are not the target. There are many scientific workloads that actually
benefit from shared-memory multi-core processors like this; they do not scale
well in a cluster. Data science is an obvious target. Developers can also
benefit a lot: have you ever compiled or profiled a very large project of
mixed C++ and Fortran (plus, for some reason, Python)? This is a godsend. I am
currently working on a single node with 2x28 cores and I still need more. For
licensing reasons I cannot use more than one node for my task.
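
The shared-memory point above is easy to demonstrate: a CPU-bound job split across local cores needs no cluster communication at all. A minimal sketch (the `partial_sum` kernel is a made-up stand-in for real numeric code):

```python
# Sketch of a CPU-bound reduction that scales with local cores on a
# shared-memory machine, using only the standard library.
from multiprocessing import Pool
import os

def partial_sum(bounds):
    lo, hi = bounds
    # Stand-in for a numeric kernel; real scientific code would call
    # into C++/Fortran here.
    return sum(i * i for i in range(lo, hi))

def parallel_sum(n, workers=None):
    workers = workers or os.cpu_count()
    step = n // workers
    # Split [0, n) into one contiguous chunk per worker.
    chunks = [(i * step, (i + 1) * step if i < workers - 1 else n)
              for i in range(workers)]
    with Pool(workers) as pool:
        return sum(pool.map(partial_sum, chunks))

if __name__ == "__main__":
    print(parallel_sum(1_000_000))
```

On a 64-core box the same script simply uses 64 workers with no code changes, which is exactly the appeal over a cluster for licensing-constrained or communication-heavy workloads.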

------
bhouston
What is the pricing? Is it cheaper per core-hour than Intel? It should be.

Also how many machines will they have? Will it be a token amount in a single
data center or will there be a good chunk of these around in multiple data
centers?

~~~
asdfjbuiobio
Why should it be cheaper than Intel? The performance is similar. The fact that
it's cheaper for Google to run doesn't mean it's less valuable for the end
user.

If anything, they should cut prices across the board. AMD just cut the price
of an x86 core in half.

~~~
bufferoverflow
Because Intel has a better brand name / recognition. Many companies default to
Intel, even when it's not the cheapest.

I'm glad AMD is making such progress that these companies can't ignore them
anymore. Having one CPU maker would be horrible.

------
polskibus
I'd love to see a comparison of the new EPYC vs. Intel on database workloads.
According to a recent AnandTech article, which quotes Intel, memory access can
give Intel a large edge.

~~~
olavgg
I would love to see benchmarks with Clickhouse which scales much better than
regular SQL databases on a single machine.

~~~
zX41ZdbW
ClickHouse is happy to use multiple cores if the query is heavy enough. We
tested it on an AMD EPYC 7351 more than a year ago and got promising results.
(I have not saved them, but I'll try to reproduce and post them here.)

Another case of scalability: we also tested ClickHouse on an AArch64 server
(Cavium ThunderX2) with 224 logical cores, and despite the fact that each
core is 3-7 times slower than an Intel E5-2650 and the code is not as
optimized as for x86_64, it was on par in throughput on heavy queries.

There are also tests of ClickHouse on POWER9 if you're interested...

------
tinco
I hope they'll also attach GPUs to those machines. We switched part of our
operation to local hardware because we couldn't get both fast GPUs and fast
CPUs in the same node.

A proper solution, of course, would be to run the CPU-intensive algorithms on
different nodes, but it's an integrated solution we pay for, so we don't
control that.

~~~
ImJasonH
You can already get 1-4 GPUs on VMs with 64-96 vCPUs in a number of regions:
[https://cloud.google.com/compute/docs/gpus/#gpus-list](https://cloud.google.com/compute/docs/gpus/#gpus-list)

Is the problem that the CPU max scales with the number of GPUs, so you can't
get 1 GPU with 96 vCPUs?

~~~
tinco
No, we were looking for the high single-thread-performance machines, i.e. the
new compute-optimized Cascade Lake ones, but you can't get GPUs on those.

