
How Intel Virtualisation Works - bytefire
https://binarydebt.wordpress.com/2018/10/14/intel-virtualisation-how-vt-x-kvm-and-qemu-work-together/
======
mettamage
I haven't read this article yet, but this is more or less the best moment to
showcase something a friend of mine made. I think it's really cool.

He created a memory allocator in which it is impossible to create dangling
pointers. He did this by effectively becoming the kernel through Intel VT-x
(i.e. running in ring 0). He uses libdune for this, which in turn uses Intel
VT-x.

Check it out at:
[https://dangless.gaborkozar.me/](https://dangless.gaborkozar.me/)

I'm going to write an ascii diagram in the upcoming edit. For now: I'll just
leave you with the legend that my friend made.

Note: my friend made a video and slides. So for people who are interested, his
slides and videos are much nicer to look at than this diagram.

DIAGRAM (of all the physical and virtual memory)

|1|<-A->|2|<-B->|3|<-C->|4|

LEGEND

1 = host physical memory

2 = host virtual memory

3 = guest physical memory

4 = guest virtual memory

A: normal host pagetable

B: extended page table (EPT; this is the VT-x thingy)

C: guest page table (this is what I mess with)
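The layered translation in the diagram can be sketched as a toy two-stage lookup (all page numbers and mappings here are invented for illustration; real translation walks multi-level page tables in hardware):

```python
# C: guest page table -- guest virtual page -> guest physical page
guest_page_table = {0x10: 0x2, 0x11: 0x3}

# B: extended page table (EPT) -- guest physical page -> host physical page
ept = {0x2: 0x7, 0x3: 0x9}

def translate(gva_page):
    """Walk both stages: guest virtual -> guest physical -> host physical."""
    gpa_page = guest_page_table[gva_page]  # stage C (what Dangless messes with)
    hpa_page = ept[gpa_page]               # stage B (provided by VT-x)
    return hpa_page

# A guest-virtual page reaches host-physical memory via two lookups:
print(hex(translate(0x10)))  # 0x7
```

Remapping entries in stage C is what lets a guest-side allocator change its own virtual-to-physical view without involving the host.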

~~~
cperciva
I'm surprised that doesn't have a larger performance cost, since it requires a
TLB entry for each memory allocation. I wonder if the benchmarks understate
the cost by being undersized for modern systems.

~~~
drb91
> I'm surprised that doesn't have a larger performance cost,

For what workload?

------
bonzini
It's weird to read a blog post about software that you know in and out. There
are a few inaccuracies here and there but it's very clear and well done.
Kudos!

~~~
bytefire
thank you that means a lot! please do add any information you think is
relevant :)

~~~
bonzini
The bit about TLBs is a bit confusing; it seems like you're talking about a
software TLB, but EPT is just a second layer of address translation.

Also, after moving a VMCS from one physical CPU to another you have to use
VMLAUNCH the first time you start the guest on the new CPU, because you had
VMCLEARed it on the old CPU. That's it. :-)
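In toy form, the VMCS launch-state rule looks like this (a simplified Python model; the real semantics are in the Intel SDM, and real failures are VMfail conditions rather than exceptions):

```python
# Toy state machine for the VMCS launch state. VMLAUNCH requires a "clear"
# VMCS, VMRESUME requires a "launched" one, and VMCLEAR resets the state --
# which is why, after VMCLEARing a VMCS to migrate it to another CPU, the
# first entry on the new CPU must be VMLAUNCH again.

class VMCS:
    def __init__(self):
        self.state = "clear"

    def vmlaunch(self):
        if self.state != "clear":
            raise RuntimeError("VMLAUNCH with a non-clear VMCS fails")
        self.state = "launched"

    def vmresume(self):
        if self.state != "launched":
            raise RuntimeError("VMRESUME with a non-launched VMCS fails")

    def vmclear(self):
        self.state = "clear"  # on real hardware, also flushes cached VMCS data

vmcs = VMCS()
vmcs.vmlaunch()   # first entry on the old CPU
vmcs.vmresume()   # subsequent entries
vmcs.vmclear()    # migrating: VMCLEAR on the old CPU
vmcs.vmlaunch()   # first entry on the new CPU must be VMLAUNCH again
```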

~~~
bytefire
very good, thank you. i'll try to tidy it up

------
burfog
Last I checked, every virtualization driver ignored Intel's overcomplicated
design choice. They don't keep VMX enabled continuously; if they did, they
would clash with each other. Instead, they fully shut down virtualization
when the VM isn't running code.

Intel seems to have accepted this state of affairs. On newer chips, it is much
faster to enable and disable virtualization.

~~~
bonzini
No, this is not true. KVM always keeps VMX on, Xen does too, even when running
paravirtualized guests, and Hyper-V does not even have a concept of "the VM
not running code". Maybe VMware Workstation and VirtualBox?

~~~
burfog
Well, I'm part of a team that maintains a VMX driver, and we've looked at what
the competition did. It's been a while since we did that, so change is
possible. Hyper-V might be special.

We could be talking past each other. Here, to clarify, are 3 methods:

x. The driver never does VMXOFF.

y. The driver does VMXON when asked to run a guest. The driver may handle
events from the guest (such as page faults or CPUID emulation) without doing
VMXOFF, but the driver will do a VMXOFF prior to letting other host processes
and drivers run.

z. The driver does VMXOFF every time the VM exits.

We found that choice x was not normally used. If it were, then VMX drivers
would not be able to coexist with each other. I'm not saying that everybody
uses choice z. Choice y is probably also popular.
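A toy simulation of why choice x blocks other drivers (driver names invented; on real hardware, VMXON while already in VMX operation fails rather than raising an exception):

```python
# VMXON fails if the CPU is already in VMX operation, so a driver that never
# does VMXOFF (choice x) prevents any other VMX driver from running a guest.

class CPU:
    def __init__(self):
        self.vmx_owner = None  # which driver currently has VMX enabled

    def vmxon(self, driver):
        if self.vmx_owner is not None:
            raise RuntimeError(
                f"VMXON fails: {self.vmx_owner} is already in VMX operation")
        self.vmx_owner = driver

    def vmxoff(self):
        self.vmx_owner = None

def run_guest_choice_y(cpu, driver):
    # Choice y: VMXON around each stretch of guest execution, VMXOFF before
    # yielding the CPU to other host processes and drivers.
    cpu.vmxon(driver)
    # ... run guest, handle exits without VMXOFF ...
    cpu.vmxoff()

cpu = CPU()
run_guest_choice_y(cpu, "driver_a")   # works
run_guest_choice_y(cpu, "driver_b")   # works: driver_a did VMXOFF first

cpu.vmxon("driver_x")                 # choice x: never does VMXOFF
try:
    run_guest_choice_y(cpu, "driver_b")
except RuntimeError as e:
    print(e)                          # driver_b's VMXON now fails
```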

~~~
mappu
_> If it were, then VMX drivers would not be able to coexist with each other._

Most VMX drivers are unable to coexist (in the sense that VirtualBox, HAXM,
Hyper-V, and VMWare Player are mutually incompatible / can't be used in the
same Windows boot session).

~~~
souprock
I just checked a proprietary VMX driver.

It fully disables VMX before turning interrupts back on.

If I remember right, it works fine with VMWare running on the same machine, so
they must be doing likewise. I think I recall problems with Hyper-V, so you
are probably right about that one. It looks like Hyper-V is the uncooperative
VMX driver that refuses to play nice with others.

------
userbinator
One thing I find annoying about x86 virtualisation is that it already has a
mode called V86, introduced in the 386, but instead of extending that with
more functionality, they introduced yet another set of instructions, and of
course AMD also has its own completely incompatible way to do virtualisation.
The nice thing about V86 is that it integrates well with the existing
task-segment model.

~~~
rodgerd
> AMD also has its own completely incompatible way to do virtualisation.

You have this reversed: AMD developed x86-64 virt, and Intel decided to go
their own way.

~~~
tedunangst
But VT-x was released in November 2005 and AMD-V in May 2006?

------
ceautery
Since Intel chips are really RISC under the hood, I wonder what crazy x86
emulation hoops they already have to jump through.

~~~
bytefire
good point. maybe the central idea of how it's implemented isn't too bad: i
see the hypervisor as a sort of OS kernel for VMs, and the transitions from
VM to hypervisor (VM exits) as akin to syscalls. of course there is more, but
the above analogy is the basic idea and other things get added along the way
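The analogy can be sketched as a toy dispatch loop (exit reasons and handlers invented for illustration; a real hypervisor re-enters the guest with VMRESUME after each exit):

```python
# A hypervisor's main loop dispatches VM exits much like an OS kernel
# dispatches syscalls: the guest traps out, the hypervisor handles the
# event, then control returns to the guest.

def handle_cpuid(vm):
    return "emulated CPUID"

def handle_ept_violation(vm):
    return "fixed up EPT mapping"

exit_handlers = {                 # cf. a syscall table in an OS kernel
    "cpuid": handle_cpuid,
    "ept_violation": handle_ept_violation,
}

def run(vm, exits):
    log = []
    for reason in exits:          # each VM exit traps into the hypervisor...
        log.append(exit_handlers[reason](vm))
    return log                    # ...which handles it and re-enters the guest

print(run("vm0", ["cpuid", "ept_violation"]))
```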

