
Compute Engine machine types with up to 96 vCPUs and 624GB of memory - ramshanker
https://cloudplatform.googleblog.com/2017/10/new-compute-engine-machine-types.html
======
dx034
Why would you use that instead of bare metal? You'll come out at >$4k per month
even with sustained use discounts; for that price you can get the same power in
bare metal.

Preemptible is a bit cheaper, but is 600GB of memory really worth it for
short-running applications? By the time you've loaded everything into memory,
your machine has probably been preempted...

EDIT: Not sure about the exact CPU performance, but it should be quite close to
what OVH offers here? With the same memory configuration and 2TB NVMe this
still costs <$1,500/month
([https://www.ovh.co.uk/dedicated_servers/hg/180bhg1.xml](https://www.ovh.co.uk/dedicated_servers/hg/180bhg1.xml)).
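
As a rough illustration of the gap, using the two figures in this comment as assumed prices (neither is a quoted list price):

```python
# Back-of-envelope comparison of the two monthly prices mentioned above.
gce_monthly = 4000   # assumed GCE 96-vCPU/624GB with sustained use discount
ovh_monthly = 1500   # assumed OVH HG dedicated server with similar specs

yearly_saving = (gce_monthly - ovh_monthly) * 12
print(yearly_saving)  # 30000 (dollars per year)
```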

~~~
earlybike
> >$4k per month with sustained use, for that price you get the same power
> in bare metal?

Not so sure about that. Plus, a hoster who provides such a machine as bare
metal wants a setup fee, needs time to set it up, and requires a minimum
contract duration much longer than one month.

I'd guess there are not many hosters that have such a beast in stock as bare
metal and available within minutes (are there any hosters at all?); they will
order such a machine themselves and you will wait at least a week.

~~~
corford
You could order 16 of these [https://www.hetzner.com/dedicated-
rootserver/px121-ssd](https://www.hetzner.com/dedicated-rootserver/px121-ssd)

Minimum contract length: 1 month; total cost (including setup): $4,384
(ongoing month-to-month cost thereafter: ~$2,191.20).

For that you'd get an aggregate total of 4TB RAM, 7.6TB HDD (SSD), 96 real
Intel E5-1650 v3 cores (or 192 vCPUs) and 800TB of bandwidth.

Sprinkle with terraform/ansible/k8s/docker and you have a resilient, massively
powerful compute cloud with no long term obligation that's about half the
price of GCE if you keep it around beyond 30 days. Or another way to look at
it: if you needed such a platform for two years, your second year would be
free compared to GCE.

One major issue with this approach (versus GCE's "all in one" box) could be
network performance bottlenecks depending on what task(s) you were using such
a cluster for.
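
A quick sanity check of those aggregates (per-server specs are assumptions based on the comment's description of the PX121-SSD):

```python
# Sanity check of the aggregate figures above.
servers = 16
ram_gb = 256        # RAM per server (assumed)
ssd_gb = 480        # SSD per server (assumed)
cores = 6           # E5-1650 v3: 6 real cores, 12 threads (assumed)
traffic_tb = 50     # included traffic per server (assumed)

print(servers * ram_gb // 1024)  # 4 (TB RAM total)
print(servers * ssd_gb)          # 7680 (GB SSD, ~7.6 TB total)
print(servers * cores)           # 96 (real cores; 192 vCPUs with HT)
print(servers * traffic_tb)      # 800 (TB traffic total)
```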

------
djb_hackernews
Interesting that usually GCP is one-upping AWS on some metric, but in this case
it isn't touching the current largest compute/memory instance on EC2, the X1
family with up to 128 vCPUs and 4TB of memory. Though the blog does allude to
them testing such types in a closed beta, it's still a game of catch-up.

~~~
peatmoss
Yikes, I'm a little out of the loop. I didn't realize you could get 4TB of RAM
in a single machine on EC2.

I've been seeing that medium data keeps getting bigger (i.e. the features of
traditional RDBMSs are eating away at the need for specialized/distributed
stores for data analysis). But it also appears that small data is getting a
lot bigger: just load that dataset into memory for analysis. 4TB of memory
allows for pretty big "small data."

"I remember back when we used to do gradient descent to estimate linear
models; back in the long ago when we didn't have 900 exabytes of memory
attached to our NVidia Matrix Crusher 9000 linear algebra accelerator unit."

~~~
sqldba
Is it possible that data isn't getting bigger - but that the people who work
with it just want to process larger data sets than before?

I mean before they'd train a model of 1,000 inputs and then test it against
another 50 and call it a day. Now they want to train it against 1,000,000
inputs.

Am I completely off base? It's not my area, but though I work with databases,
my observation is that developers always want to use the most data possible,
even when it doesn't really provide any benefit.

~~~
peatmoss
Sorry, I was being a bit playful with language. What I mean is that, if you
roughly define small, medium, and large data in terms of the strategies
required to process them, then the absolute size of the data that can be
processed using simpler methods grows.

And whether more data is needed or collectible varies by discipline.
Astrophysics collects way more data than it used to because (1) it's needed
and (2) instrumentation allows it.

Some kinds of data collection haven't scaled up, however. Surveying humans is
expensive and labor-intensive. And for many things that you might want to
study about humans, you can't simply affix a sensor to them. So what might
have been accomplished only through big-data or medium-data methods a few
years ago can now be loaded into memory (i.e. small-data strategies).

------
positivecomment
Could be useful if you wrote your neural net in AppleScript, I guess.

~~~
skj
I agree. Once we passed the 640k mark it all became an excuse for lazy tool
development.

------
jackmott
If a VM with 96 cpus is being offered, how many cpus does the bare metal
behind the VM have? What is the hardware like here?

~~~
iofiiiiiiiii
I imagine you get that 96 if you stick together 4x Xeon CPUs with 24 cores
each onto one server.

[https://www.intel.com/content/www/us/en/products/processors/...](https://www.intel.com/content/www/us/en/products/processors/xeon/e7-processors/e7-8890-v4.html)

~~~
alxv
You only need 2 processors, since each core gives you 2 vCPUs. "For the n1
series of machine types, a virtual CPU is implemented as a single hardware
hyper-thread" \--[https://cloud.google.com/compute/docs/machine-
types](https://cloud.google.com/compute/docs/machine-types)

~~~
jackmott
so the bare metal might still have 4, such that it could host 2 of these VMs
at once I guess.
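
Assuming one vCPU per hyper-thread, as the quoted GCE doc says, the socket math can be sketched like this (core counts are from Intel's public specs):

```python
# How many sockets a guest with a given vCPU count needs, if one vCPU is
# one hardware hyper-thread.
def sockets_needed(vcpus, cores_per_socket, threads_per_core=2):
    threads_per_socket = cores_per_socket * threads_per_core
    return -(-vcpus // threads_per_socket)  # ceiling division

print(sockets_needed(96, 28))  # Xeon Platinum 8180 (28 cores): 2 sockets
print(sockets_needed(96, 24))  # Xeon E7-8890 v4 (24 cores): 2 sockets
```

So a 4-socket host could indeed fit two such VMs side by side.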

------
aneutron
I honestly wonder why Google cited "SAP HANA" as the _reference_ workload for
such setups. I'd never heard of that product. Noob asking here: is it the
_reference_ workload for such setups? And are there more demanding workloads?

~~~
Jdam
Maybe because SAP HANA requires vertical scaling. For horizontal scaling, I
don't see a point in using such big instances.

~~~
mrich
You can scale HANA vertically and horizontally. Vertical makes sense for ERP
workloads where you want low latency (avoiding communication). The ERP
workloads of most companies can fit in RAM easily nowadays. Horizontal scaling
is best for data warehousing/analytical workloads, which tend to be more CPU
bound due to lots of calculations.

~~~
Jdam
> You can scale HANA vertically and horizontally

Nah, by scaling I'm referring more to approaches like Cassandra's rather than
"add another read replica".

~~~
mrich
I wasn't talking about replicas either. HANA can distribute data over many
cluster nodes.

[https://blogs.saphana.com/2014/12/10/sap-hana-scale-scale-
ha...](https://blogs.saphana.com/2014/12/10/sap-hana-scale-scale-hardware/)

HANA's roots lie in BWA (SAP BW accelerator), whose main game was distributing
large amounts of data across commodity clusters.

------
cottonseed
Did they increase the per-instance network bandwidth caps? Previously
instances had 2Gb/s/core of network bandwidth capped at 16Gb/s, which makes
8-core nodes the sweet spot for network bandwidth.

I never quite understood why this is necessary if they're cutting up larger
machines. Why should the total network cap matter if I have two 8-core
instances or one 16-core instance on the same physical machine?

edit:

[https://cloud.google.com/docs/compare/data-
centers/networkin...](https://cloud.google.com/docs/compare/data-
centers/networking)

"The egress traffic from a given VM instance is subject to maximum network
egress throughput caps. These caps are dependent on the number of cores that
the VM instance has. Each core is subject to a 2 Gbps cap for peak
performance. Each additional core increases the network cap, up to a
theoretical maximum of 16 Gbps for each instance. The actual performance you
experience will vary depending on your workload. All caps are meant as maximum
possible performance, and not sustained performance."
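
The quoted cap rule is simple enough to sketch (a toy function using the 2 Gbps/core and 16 Gbps figures above):

```python
# GCE per-instance egress cap as described in the doc quoted above:
# 2 Gbps per core, up to a 16 Gbps ceiling per instance.
def egress_cap_gbps(cores):
    return min(2 * cores, 16)

# 8 cores already hit the ceiling, hence the "8-core sweet spot":
print(egress_cap_gbps(8))   # 16
print(egress_cap_gbps(96))  # 16
```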

~~~
brianwawok
I doubt the average MySQL server needs more than 16Gb/s of bandwidth.

~~~
sqldba
... let me show you the queries the developers are writing ^^

~~~
brianwawok
So you need to fix the developers, not upgrade your network cards :)

------
mattparlane
Can you really spin one of these up on demand? That obviously means that they
have machines of that size (or greater) sitting idle, waiting for someone to
use them. That's mind-boggling.

~~~
kevin42
Yes, I configured a 64 cpu machine in a matter of minutes, ran a test on it
for 24 hours, then shut it down and deleted the instance. Total cost was
around $200.

~~~
tgtweak
*vCPU.

But yes, they probably strive to keep below 10% vacancy on all hardware.

------
em3rgent0rdr
Intel's page says these 28-core hyperthreaded processors support 8+ sockets.
Let's see: 8 sockets * 56 virtual processors per socket = 448 virtual
processors potentially in one VM.

[https://www.intel.com/content/www/us/en/products/processors/...](https://www.intel.com/content/www/us/en/products/processors/xeon/scalable/platinum-
processors/platinum-8180.html)

------
claudius
Does anyone have experience using these (or other) cloud solutions in a
scientific HPC context? The only obvious disadvantage I can think of is that
every student running something incurs a certain cost, whereas after one has
bought the actual computers, running jobs is somewhat free.

~~~
brutos
Depends heavily on the projected utilization. If you know your compute node is
going to be computing for the next 3 years with at least medium utilization,
then the self hosted metal is probably going to be quite a bit cheaper.

It's amazing how much hardware you can pack into a single machine for 10k€.
Last year our group bought two additional high-memory (768GB) nodes for around
that price each (including a couple of years of support from the vendor).

A few years before that we bought 40 nodes with 128GB RAM each, for a similar
price to last year's high-memory nodes (and a fast interconnect and a lot of
storage).

If you are at a larger research institution, you probably also have an IT
department that can co-locate your hardware for next to nothing (compared to
cloud). There you also will save a lot of ingress/egress, storage, backup,
etc. costs.

Regarding the per-student costs: even with cloud instances I would consider
running a traditional HPC job system (Grid Engine, LSF, Torque, ...). MIT had
a nice solution with StarCluster [1] to easily deploy SGE on AWS. It looks a
bit dead now, though.

[1] [http://star.mit.edu/cluster/](http://star.mit.edu/cluster/)

~~~
dx034
> Depends heavily on the projected utilization. If you know your compute node
> is going to be computing for the next 3 years with at least medium
> utilization, then the self hosted metal is probably going to be quite a bit
> cheaper.

Isn't that already the case for 1 month? Bare metal doesn't mean own data
centre or colocation. If you go with a hosting provider most offer dedicated
hardware on a monthly contract. As long as you need them longer than 1-2
months that should be significantly cheaper than Google/AWS.

~~~
vidarh
That's usually the case for almost everything on AWS/Google. If you're using
them for specific features, or for very bursty work (e.g. if you use the
instances less than about 6-8 hours a day), they can be cost effective, but
the moment you use instances full time and don't leverage/depend on a ton of
extra services, you're paying way above the odds.
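
The 6-8 hours/day figure is just break-even arithmetic; a sketch with made-up prices for illustration:

```python
# Hours of use per day below which paying hourly for on-demand beats
# renting a dedicated box for the whole month. Prices are assumptions.
def breakeven_hours_per_day(cloud_hourly, dedicated_monthly, days=30):
    return dedicated_monthly / (cloud_hourly * days)

# e.g. a $1.00/hr instance vs. a $200/month dedicated server:
print(round(breakeven_hours_per_day(1.00, 200), 1))  # 6.7 hours/day
```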

~~~
brianwawok
Kubernetes helps this a bit with bin packing. It's much easier to keep three
32-core servers loaded than thirty-two 4-core servers.

~~~
vidarh
But that's the case whether you're using AWS or self-hosting, so it doesn't
really alter the calculation much.

------
_joel
I couldn't see much info about how NUMA is handled.

~~~
zbjornson
It's not exposed through their virtualization currently. I suspect it has
something to do with making live migrations a lot harder.

------
amelius
With that many vCPUs I guess it becomes important to talk about things like
how the memory hierarchy is wired.

I'm surprised no benchmarks are mentioned.

------
mahdix
I wonder who will use these types of instances?

~~~
malux85
Scientific computing and simulation. I have had need for machines like this,
for a project that was very compute-heavy and embarrassingly parallel, but
also very "chatty", i.e. it updated the data structures a lot during compute.

I can't be too specific about it, but it involved creating a very large tree
structure and updating, pruning, and traversing the tree a lot.

If the algorithm is updating and reading a large data structure a lot, it's
only practical, from a speed point of view, to hold the whole structure in
RAM.

~~~
neutronicus
Is this really a better solution than just getting hours on NERSC (or some
other government supercomputer)?

~~~
tallanvor
Private companies want to do simulations as well, and with this type of
solution, you can pretty much run them on demand rather than having to wait in
line.

~~~
neutronicus
...I _guess_.

My advisor's company straddles the public / private divide, but we've
definitely done some simulations for private clients on NERSC, and I assume we
weren't misusing hours allocated for some other purpose.

------
jijji
price per second?

~~~
roter
Pricing is here [0]. The 96 vCPU, 360GB instance costs $4.9405/hr. If you
don't mind if it gets killed, $1.0401/hr (pre-emptible [1]).

[0]
[https://cloud.google.com/compute/pricing#machinetype](https://cloud.google.com/compute/pricing#machinetype)

[1]
[https://cloud.google.com/compute/docs/instances/preemptible](https://cloud.google.com/compute/docs/instances/preemptible)

------
samfisher83
What kind of computer is this? What motherboard supports this much memory?

~~~
Cthulhu_
It's virtual, so it's up in the air. What motherboard supports 96 CPUs?

~~~
mschuster91
vCPU == hardware thread, not core. 2 Xeon Platinum 8180s (28 cores each) get
you 112 vCPUs.

------
thisoneforwork
Azure's M128 VM size has 128 vCPUs and 2048GB of memory already.

~~~
packetslave
sure, for $20/hour

~~~
thisoneforwork
This is not a SKU to play with. If $20/hr is indeed the price (I don't know),
that's the hourly cost of a couple of waiters. You get to run SAP on someone
else's infra, with someone to support it.

------
qhwudbebd
And still no IPv6. In 2017.

------
leifaffles
Holy moly!

------
qaq
Can we stop this whole vCPU nonsense?

~~~
qaq
Do people really enjoy not knowing what they are buying? Some providers give
some info on what a vCPU is; others don't. Many people think it's an actual
CPU core.

