AWS Graviton2 (mvdirona.com)
188 points by yarapavan on Jan 25, 2020 | 136 comments



(I work for AWS. Opinions are my own and not necessarily those of my employer.)

I've been doing some initial M6g tests in my lab, and while I'm not able to disclose benchmarks, I can say that my real-world experience so far reflects what's been claimed elsewhere.

Graviton2 is going to be a game changer. It's not like the usual experience with ARM where you have to trade off performance for price, and decide whether migrating is worth the recompilation effort. In my lab, performance of the workloads I've tried so far is uniformly better than on the equivalent M5 configuration running on the Intel processor. You're not sacrificing anything by running on Graviton2.

If your workloads are based on scripting languages, Java, or Go, or you can recompile your C/C++ code, you're going to want to use these instances if you can. The pricing is going to make it irresistible. Basically, unless you're running COTS (commercial off-the-shelf software), it's a no-brainer.
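A minimal sketch of the "no recompilation" point for JVM workloads: the identical jar runs on both instance families, and only the properties it reports change (the class below is purely illustrative):

    // ArchCheck.java: run the same compiled class/jar on an x86_64 (M5) and an
    // arm64 (M6g) instance; nothing is rebuilt, only the reported values differ.
    public class ArchCheck {
        public static void main(String[] args) {
            System.out.println("os.arch = " + System.getProperty("os.arch"));   // "amd64" vs "aarch64"
            System.out.println("jvm     = " + System.getProperty("java.vm.name"));
            System.out.println("cores   = " + Runtime.getRuntime().availableProcessors());
        }
    }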


I feel like this is where hyperthreading is finally starting to bite Intel in the rear. Cloud providers have been selling "vCPUs" that aren't actual cores. I bet most customers don't even know what they're buying. Even if ARM cores are slower (and they don't really have to be), they're still going to be faster than hyperthreads.


I don't think so. Most apps only complete 0.5-0.6 instructions per clock cycle, while Intel cores can issue several per cycle, so a second hyperthread mostly fills issue slots that would otherwise sit idle and sharing the core costs surprisingly little.

Most customers don't measure, and don't optimize cache usage enough, for the actual tradeoffs (cache, TLBs) to matter.


Most, yes. But things like databases, video/data compression, compute / deep learning workloads, etc, _are_ negatively affected by the fact that cores aren't really cores. Basically anything that's actually using the CPU to an appreciable extent will be affected by that. Add to that the hyperthreading-specific CVEs as well.


I'm confident that those working on applications where CPU performance is an important issue, as well as the reliability of that performance, are not running their critical applications on virtualized infrastructure. The cloud is good for "good enough" solutions where an architecture built around horizontal scaling does the job well. In those applications, how a CPU is used or how it performs falls on the wrong side of the line into micro-optimization. In the cloud world, the only things that matter are whether an application's infrastructure needs to scale and what the financial impact of that is on operational cost.


Your confidence is misplaced. Netflix is a well known user of cloud services for encoding, and just about everyone runs at least part of their deep learning workloads on AWS or Google Cloud. Not to mention databases.


Encoding is an embarrassingly parallel problem which is trivially scalable by launching new instances. That usecase fits precisely the scenario where CPU raw performance is not an important issue, and the cloud is already good enough.


That's fine for small counts. When you're taking a 20-50% efficiency hit for this, that's a good-sized bill difference: at a 33% hit you need roughly 1.5x the instances for the same throughput, so roughly 1.5x the bill.


Performance is not tied to price, only to instance count and the amount of computational resources assigned to each virtual instance.


I am extremely excited about this development. Less so about getting everything in the stack that assumes x86 to support a “parameterized” platform.

(Hi Michael!)


Hi Rick! I know it's only a part of the toolchain needed to support the migration, but multi-arch Docker image support for Amazon ECR is definitely on our immediate roadmap.


heh, I've had a very fun experience with spotinst.com — my a1 spot instance went down and that service couldn't restore it because it did not label the AMI as arm64. Reported that to them, got a couple acknowledgements but haven't heard back in a while, so presumably this is still not fixed.


It is surprising, because I was under the impression that Java has so many optimizations for x86, and ARM was so new, that it would be almost impossible to beat x86 without very significant investment. It's nice to hear that I was wrong.


In a previous job, we had a cross-compiler that decompiled a static x86 Linux binary to Java bytecode (with syscalls emulated in Java). Testing this on an ARM processor, the Java version of a scientific benchmark we were using was significantly faster (1.6x) than the best gcc could do with the original C. That wasn't anything fancy our cross-compiler had done, but the JVM itself. This isn't a comparison to x86 hardware, just an indication of the effort put into the ARM-based JVM.


ARM has been so dominant in the mobile market that there has been a lot of effort around ARM optimisations, both in compilers and interpreted languages. Way more so than alternative architectures like MIPS etc.


Java doesn't have that many x86-specific optimisations actually. It does some auto-vectorisation and it supports things like AES-NI and other specialised hw instructions, but those are easy to port to ARM.

The vast bulk of the effort in modern JVM compiler optimisations is more at the program structural level: removing allocations, merging methods together so they can be optimised as a whole, removing abstraction, and so on. All that stuff is CPU independent.
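A hedged sketch of the kind of CPU-independent optimisation described above: after inlining, HotSpot's escape analysis can usually scalar-replace the short-lived object below, so the hot loop allocates nothing, and that holds equally on x86_64 and aarch64 (class and method names are purely illustrative):

    // The Point wrapper never escapes accumulate(); with escape analysis (on by
    // default in HotSpot) the allocation is typically eliminated entirely.
    final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
        int sum() { return x + y; }
    }

    public class EscapeDemo {
        static long accumulate(int n) {
            long total = 0;
            for (int i = 0; i < n; i++) {
                total += new Point(i, i + 1).sum(); // allocation usually optimised away
            }
            return total;
        }

        public static void main(String[] args) {
            System.out.println(accumulate(10_000_000));
        }
    }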


It's not very "new". People have been working on porting, improving and optimizing software for the Arm server ecosystem for more than a decade now; really performant and widely available hardware may be new on the scene, but it would be silly to wait for that before starting work on the software side of things...


x86's age and complexity give it a significant disadvantage. Both the cache coherency model and the instruction set incur a lot of overhead to implement at speed.


That's very interesting. Could you elaborate?


A lot of the most commonly run software out there doesn't use the large, complex instructions offered on x86, so a bunch of pristine silicon goes to waste. Use the space taken by AVX512 etc to make more, simpler cores, and you get more performance for the same price, or less cost for the same performance. Simpler cores are easier to clock higher with less voltage, and less likely to have defects that would pull down yields.


The big vector units aren't the problem though. They're a consequence of the big complicated schedulers that most x86 cores are designed with. As long as the core has to be huge anyway, you might as well spend some space on more powerful math units.

It's possible to design an x86 chip with much more priority on throughput per square centimeter, with many more simple cores working together, but I have no idea how it would work out.


There's a lot of logic and complexity on decoding, fusing, etc.


FWIW, 32-bit x86 (i386) and ARM arrived in the same year (1985).


I too have been playing with m6g, and while I’m allowed to disclose benchmarks I haven’t bothered to run any; the pedantry that unleashes is enough to drive me to drink.

“Lies, damned lies, and benchmarks.” I can say that the qualitative experience is superb; everything “just works” as you’d expect it to, and performance is stellar.


Does the chip still manage to have superior power efficiency versus x86 even at these performance levels?


From the comments section of the article:

> Because there are so many power sensitive applications where ARMs are used, much has been invested in power minimization and management and they do very well. It’s easy to get remarkably better power consumption with an ARM part. But, in this particular case, our focus was more on server-side price/performance and, with that focus, our power consumption isn’t really materially better than the alternatives.


Just a small reminder that this is comparing an x86 part built on Intel 14nm+++++ against a part built on TSMC 7nm, i.e. the power efficiency likely has absolutely nothing to do with the ISA in use.


To my uninformed understanding, power efficiency wasn’t really the design goal. If it’s AWS’s power bill, do we care as much (presuming sustainable energy)?


>If it’s AWS’s power bill, do we care as much (presuming sustainable energy)?

I care because it's an interesting question regarding future trends in technology. More broadly, people care about a lot more than just what has direct short term applications to their jobs.


Fair. Trouble is, we’re never going to get Graviton2 devices of our own outside of an AWS datacenter or device; the power profile is intellectually interesting, but not likely to ever enter the public sphere.


I'm not sure why Amazon wouldn't try to recoup some of the investment by designing and marketing other boards / devices with this chip, or by just selling it in quantity to some manufacturers (think telecom, home entertainment, industrial equipment, etc).


Based upon my experience, I’d say it’s going to be profitable in its own right just by powering EC2 instances. The same argument could be said to apply to Apple’s ARM chips.


> the power profile is intellectually interesting, but not likely to ever enter the public sphere.

It does have an impact on operational cost, and cost is a very important factor.


If something is possible and economical then others will copy it in time.


Power and heat dissipation are a huge part of TCO for datacenters. I'm sure Amazon picked an appropriate point on the performance per watt curve.


Given that your cost is going to be correlated to AWS’s cost of goods sold, I’d argue of course we care.


I care what AWS charges me; I don’t have the energy to care what their underlying cost structure and its constituent parts looks like.


As I said in my comment, their underlying cost structure has a direct relationship to the cost you incur (at least in the long run with semi efficient markets). There’s a reason why commodity markets tend to revert towards variable cost.


> There’s a reason why commodity markets tend to revert towards variable cost.

Is that some economic theory or model?

Or is it a way of saying the final price of a product varies greatly because the commodity part of the product is only a small percentage of the TCO / BOM?


Indeed; we shouldn't care about AWS power efficiency but we also shouldn't assume that it's bad just because we can't see it.


I can think of exactly 0 circumstances where Amazon wouldn't pass on its power bill to its customers.


> presuming sustainable energy

Pretty big presumption.


Hmm, so did Qualcomm miss out on a huge sales opportunity by bowing out? Or did they luck out not having to compete against vertical integration?


Any possibility of Graviton or other ARM-based computers hitting mainstream computing devices like laptops as well?


Microsoft and Qualcomm are selling ARM laptops.


Those are probably using smartphone processors; the die size alone of Graviton2 makes it desktop/server class.


The Neoverse N1 cores used in the Graviton2 are very close architecturally to the Kryo 495 Gold cores in the 8cx. Of course, the 8cx only has 4 of them, while this has 64.


The Qualcomm 8cx CPU is based on a smartphone one but optimized for laptop use.


Would these chips be reasonable for something like an i3en class?


Will it be available on Fargate?


> Here’s comparative data between M6g and M5, the previous generation instance type

Instead of comparing the 7nm Graviton2 processor against a 14nm Intel processor, I'd like to see its performance compared to an AMD Epyc 2 processor, which would be a more apples-to-apples comparison as both are "7nm" parts. Unfortunately Epyc 2 processors aren't available from AWS yet (but have already been announced: https://aws.amazon.com/de/blogs/aws/in-the-works-new-amd-pow...).


This is what I'm curious about, as Epyc currently represents the state-of-the-art x86, right? What's the TCO comparison when factoring in performance, density, and power?


You're putting way too much emphasis on process tech. TSMC 7nm is roughly equivalent to Intel 10nm, so it's only a generation behind. Intel 10nm products, where available, have also not exactly been lighting the world on fire with their performance.


That's why I quoted 7nm. I'm aware of the process differences and Intel's severe and continuing problems with their 10nm process.

My point being that if you want to compare a state-of-the-art ARM CPU, you should compare it to a state-of-the-art x86 CPU, and Intel's CPUs are simply not state-of-the-art at the moment.


To the degree that they're not state of the art (and I think you're overselling that, Intel does pretty well core for core), it's not at all because of 7nm TSMC vs 14nm Intel.

Intel would be doing just fine with a 14nm Ice Lake.


No. Process tech is profoundly correlated with CPU performance; it's literally the definition of Moore's law. Intel has no 10nm server processors. Intel's near five-year delay in getting past 14nm has opened a huge window of opportunity to competitors, both AMD and Amazon/Arm. It is highly worth comparing these competitors apples to apples vs Intel's oranges.


"No." yourself :)

Smaller processes used to mean higher frequency switching, lower power and increased density. With the death of Dennard scaling, we mostly just get the latter. This means that the benefit is now largely economic; you get largely the same chips, you can just pack them more tightly on the wafer.

If you're one node behind, you still price the chips at a price the market will bear; they just cost you a bit more to produce. And maybe not even that; mature last-generation nodes perform pretty damn well against immature next-generation nodes once you take yield and performance into account.

Intel's 14nm transition yielded Broadwell Xeon [1], barely any improvement over the Haswell chips. Haswell itself, however, gave us a ~50% performance boost on the same process node. This is the difference between a new process node and a new architecture in today's world.

The reason Intel is in trouble is because of the 10nm fiasco, but not because of the lack of a die shrink. Their shrinks have been working like a well oiled machine for decades, and there was no contingency in place for a large delay. All post-Skylake chips were being developed tightly against their 10nm libraries, with no possibility of a back port. It's not the lack of a Haswell->Broadwell analogous die shrink that's hurting Intel, but the lack of a Haswell->Broadwell->Skylake style die shrink + new architecture.

How do you know this is true? Because Intel switched gears and is now decoupling future architectures from die shrinks. If they did this earlier, you'd be seeing Ice Lake (or maybe Tiger Lake) on 14nm++ as an answer to Zen 2, and it would be a pretty good chip. Instead they're doing whatever minor tweaks they can to so many variations of Skylake I'm not sure I could list all the codenames from memory.

Zen 2 is a seriously formidable chip, but most of the benefit came from cleaning up nasty edge cases in performance, like cross core communication being slower than a spill to DRAM. You can't disentangle the shrink from the architecture, because they happened simultaneously.

[1] https://www.anandtech.com/show/10158/the-intel-xeon-e5-v4-re... [2] https://www.anandtech.com/show/8423/intel-xeon-e5-version-3-...


> Zen 2 is a seriously formidable chip, but most of the benefit came from cleaning up nasty edge cases in performance, like cross core communication being slower than a spill to DRAM. You can't disentangle the shrink from the architecture, because they happened simultaneously.

AMD seems to disagree. In their "Next Horizon Gaming Tech Day General Session" last year they claimed that ~40% of the Zen 2 performance improvements came from "Design Frequency and 7nm Process", while the remaining ~60% are from "IPC-Enhancements" ([1] slide 13). As the frequency is directly related to the process it's obvious that moving to TSMC's 7nm process played a pretty important role for the performance improvements.

[1]: https://www.slideshare.net/secret/HK00TfQ8ibUlLR


60% is most, where exactly is the disagreement?


Everything I know about process tech says that they're not strictly comparable like that between these different processors. Was that wrong?


If Epyc CPUs aren't available, then it isn't apples to apples.

A customer doesn't care about nm. They care about what's available. The apples-to-apples comparison is the best x86 available vs the best ARM available.


You can buy Epyc v2 CPUs from Newegg. It's just that AWS doesn't have instance types that use them.


> It's just that AWS doesn't have instance types that use them.

AWS offers instance types with Epyc CPUs

https://aws.amazon.com/ec2/amd/


Yeah, but not second generation Epyc.

Their blog says the instance type will be called C5a: https://aws.amazon.com/blogs/aws/in-the-works-new-amd-powere...

Right now, according to the official C5a page, they are "coming soon".


Graviton2 based EC2 instances aren't generally available yet either. To quote the product page [1]:

> Amazon EC2 M6g instances are currently in preview and will be generally available soon.

[1]: https://aws.amazon.com/ec2/instance-types/m6/


> If Epyc CPUs aren't available,

You can buy Epyc CPUs right now even from Amazon, and you can even use Epyc CPUs in EC2 instances.


Those are AMD Epyc 1 CPUs, which are built on a 14nm process at GlobalFoundries. Epyc 2 CPUs, which are significantly faster, have been announced for AWS but aren't available there yet.


Thanks for the info. That's something to keep on the radar.


There are some interesting implications for widely deployed processors that are literally never publicly seen because they spend their whole life in a highly locked down data center. I wonder if things like Meltdown could have ever been discovered if researchers could only poke at the chips via EC2.


I believe the “metal” variants expose the processor extensions you’d need to discover Meltdown.


Also, AWS processors use off-the-shelf Arm Cortex/Neoverse cores, and stuff like Spectre is core-level.


That’s a good point. And AWS freely allows this type of security research as well: https://twitter.com/TeriRadichel/status/1101228943128969218


* [James Hamilton] I believe there is a high probability we are now looking at what will become the first high volume ARM Server. More speeds and feeds:

- >30B transistors in 7nm process
- 64KB icache, 64KB dcache, and 1MB L2 cache
- 2TB/s internal, full-mesh fabric
- Each vCPU is a full non-shared core (not SMT)
- Dual SIMD pipelines/core including ML optimized int8 and fp16
- Fully cache coherent L1 cache
- 100% encrypted DRAM
- 8 DRAM channels at 3200 MHz

* ARM Servers have been inevitable for a long time but it’s great to finally see them here and in customers' hands in large numbers.
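Back-of-the-envelope, assuming those are the usual 64-bit DDR4-3200 channels: 8 channels x 3.2 GT/s x 8 bytes per transfer is roughly 204.8 GB/s of theoretical peak DRAM bandwidth per socket.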


What really stands out for me is "100% encrypted DRAM".

How efficient is this? Can different cores have different encryption keys, so that different VMs under a hypervisor can't benefit from breaking the hypervisor's protections?


Some Intel chips can encrypt memory with different keys for different VMs. This sounds great for marketing but adds basically no security whatsoever. The feature is called MKTME.

What’s going on here is that “different keys for different VMs” does not actually improve isolation without a considerable amount of hardware or microcode enforcement. AMD has this type of tracking of which VM is which; Intel does not. I don’t know what the Arm chips do.

In any case, the encryption makes little difference. Cores aren’t bound 1:1 to VMs, so the core can access any VM’s data if it wants. And actually clearing the key on a context switch would require flushing caches and require that there is no cache shared between cores. The performance hit would be extreme.


In fairness to Intel they also have SGX which has encrypted RAM and also a lot of isolation logic, as well as working RA, recovery, versioned sealed data and a lot of other things that AMD's equivalent just doesn't do well or at all.


This is true, but you can’t put a VM in SGX without massive software hackery. Also, SGX has been broken so many times in the last couple years that it’s silly.


SGX has been broken by totally new classes of attacks and has been successfully renewed via microcode patches every time.

SEV was broken once, completely (at least on EPYC) in such a way that it could not be fixed. From what I understand.

So I'll give Intel a break here. Their performance is much better than AMD's.

The whole point of SGX is that people tried making an entire VM the security surface. That was the prior generation of tech (Intel LaGrande/TXT) and it didn't work. There's far too much code in an entire OS like Linux to make it secure or auditable (and without auditing none of these schemes mean anything).

Enclaves are a design idea that says, shrink the amount of code you have to trust and read to the smallest size possible. Only then do you have a chance of security.

It's unfortunate that this lesson has been learned and is now being lost again.


> SGX has been broken by totally new classes of attacks and has been successfully renewed via microcode patches every time.

As far as I can tell, it’s only “successfully renewed” if you have HT off. If HT is on, SGX is dead.


What is the point of sharing cache between VMs?

Pinning a VM to a set of cores when encryption is enabled would make sense, and could be a feature cloud users would be willing to pay for.


Basically every recent CPU has a big cache shared between all cores. So, unless you pin a VM to one socket and you do something to encrypt cache coherency traffic between sockets per VM, you lose.

The underlying issue here is that encryption is fast but not fast enough. So no one encrypts cache — instead, plaintext is cached and data is encrypted on its way to DRAM. So the actual isolation is in the access controls that the CPUs apply to which process or VM can access which pages, and this has little to do with encryption.

It’s worth noting that Intel has been very bad lately at protecting cache contents from side channels, while AMD has done just fine. You can turn fancy encryption on, but those side channels leak plaintext.


I can’t speak for the Graviton2 CPUs, but AMD Epyc CPUs have RAM encryption with per-VM keys for increased isolation: https://developer.amd.com/sev/


Same question from me. At what point does dedicated hosting become more efficient than encrypted everything?


Pretty much never. With dedicated servers you don't have anyone to split the Nitro overhead with.


That seems like another blow to high-performance computing on VMs.


Is there somewhere to buy something like this for use at home?


Ampere QuickSilver, but don't expect an 80-core server to be cheap.


Probably not until end of life for these chips



Amazon will dominate cloud computing with these server CPUs. Who can compete with vertical integration at the sheer scale of AWS? AWS usage patterns tell them exactly what to accelerate with silicon. A process that has been largely driven by Intel will be replaced by a process driven by the customer workloads themselves. These processors will only get better with time.


That is basically the Amazon scale playbook 101 (aka the flywheel): if there is an efficiency that can benefit the customer (e.g. lowering prices), they will chase it.

It doesn't matter if that means designing Graviton2 or challenging FedEx by trying to build the biggest delivery network in the USA.


I wonder when and how Azure and Google Cloud will compete with AWS in this market.

They could buy ARM processors available on the market, but I doubt they will be able to get them as cheap as AWS, which builds its own.


I led the support for multi-arch images for Borg.

Google had the software stack ready for internal workloads a long time ago; PowerPC was used.

https://www.forbes.com/sites/patrickmoorhead/2018/03/19/head...


I am sure the software is multi-arch ready but I wonder if they are evaluating ARM servers either for use internally or to launch in the cloud...


Does Microsoft make any of their own silicon?

It feels like Google has been directing their in-house designs on ML/TPUs while Amazon went all in on ARM. It will be interesting to see how those bets pay off.


No, but Microsoft bought some off the shelf Ampere and Cavium/Marvell servers. But they keep them for internal use only for now :(

Huawei makes their own silicon and servers with that silicon — also only internal, not available on huaweicloud :(

The only other player is Scaleway who bought first gen Cavium ThunderX's way back when. And Packet of course but that's bare metal only, no cheap small VPSes.


Huawei Cloud does have Kunpeng ARM servers available in some AZs (at least I know Bangkok AZ2 has some). They also run managed Redis on ARM so cheaply that it would cost more to run it yourself on an Intel VM.

I'm excited to see the price drop when Elasticache moves to ARM.


huh! I see now that they are mentioned on the Chinese Mainland website, but not on /intl.


If they are good enough for internal use, they should be good enough for public use. Not really sure what is stopping them from exposing it to the public... I am sure there is _some_ demand for it.


There are a number of reasons not to launch as an external cloud offering. A few:

- Reliability (performance and availability) could be below Azure standards

- Supply chain maturity - they may have difficulty scaling procurement and deployment to meet orders

- Lock in - major cloud providers typically provide product guarantees with advance notice on the order of years before a deprecation. It's a big commitment to launch a product externally.

- Business case - maybe the TCO doesn't make sense when compared with Azure's data on demand and price point


> Does Microsoft make any of their own silicon?

No, but they've been working closely with Qualcomm since the Windows Phone 7 days (10 years ago). Their recent Surface Pro X runs a customized Snapdragon 8cx dubbed "Microsoft SQ1".

I wonder if it could help bring ARM to Azure.


They did for Hololens 2.0 [0], so they have some expertise, but that's a different segment.

[0]: https://www.youtube.com/watch?v=IjxpMZUqu6c


Or they could make one themselves as well?

It is not like Google or Microsoft lack the in-house expertise for this task. The core and interconnect on Graviton2 are licensed from Arm, based on Neoverse, and it is fabbed on TSMC 7nm.

While there is still a lot of customisation involved, I would not be surprised if Arm already has a few solutions on hand.

The cost advantage of fabbing your own CPU is so huge that it is only a matter of time before Google or Microsoft make their own CPUs to compete.


They could, and probably will... for 3 years and then sunset the product.

I'd consider x86 in their environment but never anything I can't immediately port somewhere else.

Stock ARM maybe. Anything boutique? Nope.


You can run the same binaries on Graviton as on other Arm server platforms from eg HP or Lenovo, in the same way that you can run x86 binaries on Intel or AMD processors.


I believe Amazon have subbed out manufacture to TSMC.


True, but the design is licensed from ARM and tuned to AWS's requirements.


This is good news. Are Linux server distributions for ARM64 on par with their PC counterparts yet? Getting base-layer software is not going to be an issue?


Been using Ubuntu on one for a few weeks; the only things I missed were a few Docker containers that weren’t built for it, and aws-vault didn’t have an ARM binary. I built my own, and aws-vault shipped a new release with ARM support 20 minutes after I whined about it on Twitter.

Everything else has been flawless.


Are there any security fears with virtualization on ARM (think Meltdown and Spectre)? I'd think it's been less studied than Intel's x86-64 chips.


What services are people using to run continuous integration for ARM? I see Travis CI has an alpha. Azure Pipelines doesn't host ARM instances I think.



GitLab-CI can use any host (and can be used with GitHub, though the PR integration is not nearly as nice as when used with a GitLab MR).


Azure Pipelines doesn't host ARM yet, but you can run a self-hosted pool on a Graviton2 VM pretty easily.


This is great news. Except that a ton of software doesn't support ARM.

When I was trying to shift all my current infrastructure onto a couple of RPis, many of the Docker containers didn't support ARM (qemu and buildx aren't reliable) and other software didn't support ARM either.

Unless there's a good way to go from AMD64/x86 to ARM, I'm not entirely sure how much traction Graviton or other competitors will get.


Back in the late 90’s and early 00’s, there were a ton of cpu platforms around: SGI MIPS, DEC Alpha’s, Intel, Sun SPARC, etc... while I will admit it was a colossal pain working somewhere that had all of those, it was often possible to recompile from source to get things to run. I’m not suggesting it’s trivial, but given the incredible investment in ARM in the mobile space, the wind is at least at your back today. It certainly has got to be much easier than it was in the days of being the only person in the world trying to recompile an obscure open source scientific computing package for DEC Alpha. Commercial software is a different beast, but even there, the incentive will be high to do a port if lots of people start migrating to this.


Windows NT 4 supported x86, Alpha, MIPS, and PowerPC. Yikes.


All have eight-bit bytes with 2's complement, the same available word sizes, and the same float formats (with some complexity on the Alpha due to VAX compat). C code will mostly not care beyond endianness. The PDP is the strange one.
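And for the pair that matters today (x86_64 and AArch64 Linux), even endianness lines up: both run little-endian in practice, which any runtime can confirm; a minimal Java sketch:

    import java.nio.ByteOrder;

    // Prints LITTLE_ENDIAN on both x86_64 and AArch64 Linux, so code ported
    // between them rarely hits the endianness issues that big-endian SPARC/MIPS
    // boxes used to expose.
    public class EndianCheck {
        public static void main(String[] args) {
            System.out.println("native byte order = " + ByteOrder.nativeOrder());
        }
    }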


But also, a ton of it already does or can be made to do it, e.g. look at what various distros already have in their ARM variants. Sure, it's not quite as off-the-shelf yet (e.g. because people making random docker containers don't bother yet to build an ARM variant too), but the investment needed isn't that big in many cases, so if those servers offer a compelling reason for you, it can easily be worth it.


Assuming Amazon intend to upstream their work.

All of the open source stack Amazon uses internally, that is, all of their hosted open source offerings, will be tested on and support ARM. This will kickstart software support broadly. AWS's ARM instances offer a cost advantage, so proprietary software vendors now have an incentive to port, or their customers will request ARM support.

All of this will create a positive feedback loop in the ecosystem.


I guess the real benchmark is whether they'll put it to use with Lambda, Fargate etc.


I think Amazon mentioned they intend to use their own chips across all of AWS except their IaaS / EC2 offering, where you still get to choose servers running on x86.

That is why it was described as the fall of x86 on servers.


Sounds interesting, where did they say it?


> And AWS' initial strategy is to move its internal services to Graviton2-based infrastructure. Graviton2 required significant investment, but AWS can garner returns and improve its operating margins due to the ability to cut out middlemen involved with procuring processors, power savings due to Arm and efficiency gains from optimizing its own infrastructure.

> AWS services like Amazon Elastic Load Balancing, Amazon ElastiCache, and Amazon Elastic Map Reduce have tested the AWS Graviton2 instances and plan to move them into production in 2020.

Normally I try to find primary sources rather than secondary ones like ZDNet [1], but I think that exact wording was quite widely reported at the time.

They say they are not anti-Intel or anti-AMD, which is true. (They are only anti-x86.) And they said the same about UPS and FedEx at the time.

[1] https://www.zdnet.com/article/aws-graviton2-what-it-means-fo...


Out of curiosity, what's the state of Jazelle on modern ARM? Would it help server-side ARM, or has the world (and JITs) moved on?


Jazelle is dead. The v8 version (or maybe even v7; I forget) of the 32-bit architecture basically mandated that only 'trivial' Jazelle (which is the not-actually-there version) could be implemented, and 64-bit has never had anything like it. It was at best a technology of its time (when phone Java implementations were mostly interpreted, not JITs). It would be useless to a modern Java implementation.


Jazelle accelerated bytecode interpreting (a bit) but modern JVMs spend nearly all their time running compiled code, so it doesn't really help and was abandoned.

There are CPU HW features that Intel doesn't have which can benefit JVM workloads but they're all pretty obscure and aren't really Java specific.
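An easy, hedged way to see this on any recent HotSpot build: run a hot method with the standard -XX:+PrintCompilation flag and watch it get JIT-compiled almost immediately, leaving the interpreter (the part Jazelle sped up) as only a brief warm-up phase (the class below is just illustrative):

    // Run with: java -XX:+PrintCompilation HotLoop
    // The log should show HotLoop::square (and the loop itself, via OSR) being
    // compiled shortly after startup; almost all execution time is then spent in
    // compiled code, not the interpreter.
    public class HotLoop {
        static long square(long x) { return x * x; }

        public static void main(String[] args) {
            long total = 0;
            for (long i = 0; i < 50_000_000L; i++) {
                total += square(i);
            }
            System.out.println(total);
        }
    }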


When can I buy something similar for a homelab?


You can buy Cavium ThunderX2s off eBay. They're last-gen ARM server chips. The performance won't be as good, but if it's just for playing around with, they're more than adequate.


> believing that massive client volumes fund the R&D stream that feeds most server-side innovation.

What does he mean here?


Intel/AMD design laptop/desktop (client) cores then put those cores into server processors. Because vastly more PCs are sold, they effectively subsidize server processors. Arm has a similar advantage, designing cores for phones/tablets and repurposing/extending them for servers.


Ohh I see, thank you!


I'm curious about what languages or types of projects are already running on ARM servers in the cloud (and actually benefiting!)


Annapurna, the goddess of job security.


Damn, has anyone tried Ruby on one of these CPUs? Is it really a 20% perf improvement on nginx? These sound too good to be true.


This is great, but I'd be really excited if we could go out and buy the chips ourselves instead of having to pay the Amazon tax and run our code on untrusted systems in the cloud. Of course, Amazon has little incentive to sell the chips, since it gives them a competitive advantage against other cloud providers.



