
AWS EC2 Virtualization 2017: Including Nitro - brendangregg
http://www.brendangregg.com/blog/2017-11-29/aws-ec2-virtualization-2017.html
======
ksec
Turns out Nitro is not only an improved KVM but also works exclusively with
their custom Nitro silicon. [1]

And we are finally here, the true cloud, elastic computing with little to no
performance overhead.

[1]:
[https://www.theregister.co.uk/2017/11/29/aws_reveals_nitro_a...](https://www.theregister.co.uk/2017/11/29/aws_reveals_nitro_architecture_bare_metal_ec2_guard_duty_security_tool/)

~~~
Annatar
We were here back in 2005 with the introduction of Solaris zones, which now
live on in SmartOS. Took GNU/Linux “only” 12 years to begin to catch up. This
might be new to you, but it’s nothing new to those of us who built private
clouds back in 2005/2006. Just don’t think it’s new because it isn’t.

~~~
Erwin
You know the author of this piece is Brendan Gregg, right? DTrace author,
performance guru, ex-Sun engineer and lead performance engineer at Joyent. And
now a full time Linux guy. He doesn't think it's so bad apparently!

~~~
Annatar
Yes, I know that; we even corresponded privately. However, an appeal to
authority does not magically validate Amazon, nor does it magically invalidate
my points. Nothing Brendan wrote contradicts the points either of us made.

~~~
openasocket
What is your point, exactly? That we shouldn't care about hardware-backed
virtualization because we have zones? That AWS should go home and re-write EC2
to just be zones instead of VMs?

~~~
Annatar
_What is your point, exactly? That we shouldn't care about hardware-backed
virtualization because we have zones?_

Yes!

_That AWS should go home and re-write EC2 to just be zones instead of VMs?_

There is no reason for AWS to exist.

~~~
openasocket
Zones aren't a replacement for VMs. With VMs you can have different OSes, for
example, which is not generally possible with a zone (you can do some stuff
with syscall routing to do this in certain ways, but requires support from the
root container). In a zone you can't do a variety of things that require
higher level kernel access, like installing custom kernel modules. Zones also
have a higher attack surface, which matters quite a lot when you are dealing
with a multi-tenancy situation. But I'm pretty sure you knew that already.

~~~
Annatar
If you need another OS besides SmartOS for serving data, you have far bigger
problems.

Kernel modules are only meant to be installed in the global zone, on the
hardware. This is by design. If you need access to that particular device, it
can be delegated to that zone with the add subcommand to zonecfg(1M).

Zones do not increase attack surface; I don't know what would make you think
that. They reduce the attack surface because one can set up read-only zones;
good luck changing anything in there. In a zone, an attacker lands in a jail,
assuming they actually manage to break in!

What strange times we live in when there is so much misconception. So much
knowledge lost! No wonder AWS and Linux rule the day!

Zones are a revolutionary technology in terms of network security. What makes
you think zones increase attack surface?

~~~
openasocket
Zones increase attack surface compared to a hypervisor because you have a
shared kernel, and every syscall is a potential attack surface. Compare that
to a hypervisor, which exposes a much simpler API with far less functionality.
Hypervisor vulnerabilities are far less common than kernel vulnerabilities for
that reason. This is fairly well known in the security space.

In fact, if you look into Joyent, you'll see they don't use regular zones for
multi-tenancy, but rather a KVM-inspired hypervisor implementation of zones.

~~~
solarengineer
SmartOS provides Zones and, if you need it, a KVM instance within a Zone.

Hypervisors have their own security issues. They're just programs. That's why
there've been all the patching over the years for Xen and for KVM.

Last year, an independent researcher did find an issue related to DTrace that
Joyent patched. However, the security engineering that backs Zones goes well
beyond what other kernels have.

[https://www.joyent.com/blog/dtrace-conf-16-wrap-up](https://www.joyent.com/blog/dtrace-conf-16-wrap-up)

DTrace exploitation:
[http://slides.com/benmurphy/deck#/](http://slides.com/benmurphy/deck#/)

[http://benmmurphy.github.io/blog/2017/01/06/arbitrary-kernel...](http://benmmurphy.github.io/blog/2017/01/06/arbitrary-kernel-memory-reads-on-illumos/)

[https://vimeo.com/173300650](https://vimeo.com/173300650)

------
zimbatm
I wonder what measures are in place in the bare-metal edition to protect
themselves from firmware attacks. What protects them from possible attacks
against the BIOS flash, or from an attacker potentially infecting the CPU and
HDD chips?

~~~
_msw_
There is a Nitro security chip that is integrated into the motherboard. SPI
and I2C buses are routed through the security chip. During system reset, the
security chip holds motherboard components in reset while firmware is
inspected out of band. So all contents of flash are verified.

Local NVMe storage is also handled through our Nitro hardware, and the
underlying physical flash devices are protected from firmware modifications.

[https://www.youtube.com/watch?v=o9_4uGvbvnk](https://www.youtube.com/watch?v=o9_4uGvbvnk)
provides an overview.

------
jhgg
Am super curious to see some benches on AWS Nitro compared with GCP's Skylake
VMs.

------
saurik
This article talks about the underlying technical differences between various
classes of EC2 computer... is this documented by Amazon anywhere? I would
totally care more about replacing stuff if I knew "ok, there is some new
virtualization difference in this new set of machines" that I could analyze
against my use cases instead of "I have no clue what if anything is different,
other than the pricing is a little different".

~~~
TheDong
Some of this is fairly well known if you happen to be quite familiar with Xen.

If your use case is "I don't want to understand Xen", then fine, just use AWS
and pick the newest instances, they're usually the best.

AWS does have some words about PV vs HVM here:
[http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/virtualiz...](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/virtualization_types.html)

But really, they don't say much and that's fine. If you just want to run
stuff, you probably don't need to understand the deep details of Xen or PV
drivers or whatever; the cpu speed, memory, and "disk io"/"cpu credits" they
document quite well will be what matters significantly more.

If you are at the scale of Netflix, then it does make sense to figure out an
extra several percent of disk performance.

> I would totally care more about replacing stuff if I knew ... instead of "I
> have no clue what if anything is different

They almost always have blog posts accompanying each new instance launch which
roughly outline the performance improvements and the reasons that instance
type is cool vs previous ones, even if they don't often go into the depth of
virtualization differences (which, again, usually doesn't matter!). For Nitro,
the C5 blog _did_ talk about the virtualization stuff in depth:
[https://aws.amazon.com/blogs/aws/now-available-compute-inten...](https://aws.amazon.com/blogs/aws/now-available-compute-intensive-c5-instances-for-amazon-ec2/)

~~~
saurik
I knew about PV vs. HVM, as that is documented. However, there is a ton of
nuance in this article about every instance type and specific changes to what
parts of the system are virtualized (and in particular stuff with EBS, which I
stress heavily). I guess you are saying that people need to essentially find
all of the old blog posts from AWS to figure out any of this, but why isn't it
just documented? I clearly _am_ someone who cares a lot about these things, or
I wouldn't have posted the question I did: but Amazon's instances are
essentially just black boxes. I am hoping that there is somewhere in the
documentation that I don't know about that just shows the differences between
all of these instance types in a few centralized pages.

------
andrewstuart
Does this allow nested virtualization?

i.e., the ability to run VMs within an EC2 instance?

~~~
aliguori
C5 does not support nested virtualization but i3.metal allows using
virtualization technology without nested virtualization.

Both i3.metal and c5 use the same underlying Nitro technology.
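
One rough way to see this difference from inside an instance is to check whether the CPU advertises hardware virtualization extensions to the OS. The sketch below assumes a Linux guest and parses `/proc/cpuinfo`-style text for the `vmx` (Intel VT-x) or `svm` (AMD-V) flags. The function name is made up for illustration, and a visible flag is only a hint that a hypervisor such as KVM could run, not a guarantee.

```python
def virtualization_flags(cpuinfo_text):
    """Return the hardware virtualization flags found in cpuinfo-style text.

    Hypothetical helper: looks only for "vmx" (Intel) and "svm" (AMD).
    """
    found = set()
    for line in cpuinfo_text.splitlines():
        # On x86 Linux the CPU feature list is on lines keyed "flags".
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            found |= {"vmx", "svm"} & set(value.split())
    return found


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            flags = virtualization_flags(f.read())
    except OSError:
        # Not on Linux, or /proc is unavailable.
        flags = set()
    print("virtualization extensions visible:", sorted(flags) or "none")
```

On an instance type without nested virtualization one would expect neither flag to be visible to the guest, while on a bare metal instance the flag should show up. Treat that as an expectation to verify on a real instance rather than a documented guarantee.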

~~~
bkeroack
Wait, so even the "bare metal" i3.metal sits under a hypervisor? It would seem
inaccurate to call it "bare metal" in that case.

~~~
gregdunn
No, Nitro refers to the whole technology stack as well as the hypervisor.
*.metal instances utilize most of the Nitro technology stack, but not the
Nitro hypervisor.

~~~
bkeroack
Ah, ok. That makes sense. Thank you for the clarification.

~~~
gregdunn
You're welcome!

[https://www.youtube.com/watch?v=o9_4uGvbvnk](https://www.youtube.com/watch?v=o9_4uGvbvnk)
might be of interest - Matt Wilson talks a lot about the underlying magic of
the bare metal instances.

------
aedocw
How much of this went back upstream to improve KVM itself? Or is all of the
Nitro work just special sauce on top that they're not required to share?

~~~
_msw_
[https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...](https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log/?qt=author&q=amazon)

------
SteveNuts
Can you use a regular AMI on the bare metal instances?

~~~
gregdunn
AMIs that work on c5 instances should work fine on bare metal instances. NVMe
and ENA support will be the big two things you need.
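
A quick sanity check for those two drivers on a running Linux image could look like the sketch below. It assumes the drivers are built as loadable modules named `nvme` and `ena` and reads `/proc/modules`-style text; the helper names are hypothetical. Drivers compiled directly into the kernel do not appear in `/proc/modules`, so a "missing" result here only means "not loaded as a module".

```python
REQUIRED = ("nvme", "ena")  # module names assumed for illustration


def loaded_modules(proc_modules_text):
    """Return the set of module names listed in /proc/modules-style text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            # The first column of each line is the module name.
            names.add(fields[0])
    return names


def missing_drivers(proc_modules_text, required=REQUIRED):
    """Return the required module names that are not currently loaded."""
    loaded = loaded_modules(proc_modules_text)
    return [name for name in required if name not in loaded]


if __name__ == "__main__":
    try:
        with open("/proc/modules") as f:
            missing = missing_drivers(f.read())
    except OSError:
        # Not on Linux, or /proc is unavailable: nothing can be confirmed.
        missing = list(REQUIRED)
    print("not loaded as modules:", ", ".join(missing) or "none")
```

Running this on an existing instance before building the AMI gives a first hint; it does not replace actually booting the image on the target instance type.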

