
QEMU can (slowly) emulate any architecture on any other architecture. In this case, they're using QEMU to emulate x86-64 on ARM64. No nesting or Rosetta is needed.



Why wouldn’t they use Rosetta though? I’d wager the performance of Rosetta would be better than QEMU emulation, but perhaps it’s more optimised for desktop apps


Rosetta is limited to running Darwin/x86-64 user processes on Darwin/ARM64 but Docker needs to run Linux/x86-64 containers/processes on a Linux/x86-64 kernel on a Darwin/ARM64 host.


I think the parent comment meant -- why not use Rosetta to run an x86 qemu process? Then the architecture emulation (translation?) would be done by Rosetta (potentially faster), as opposed to software emulation by qemu.

Now, this might not work, as I'm not sure Rosetta covers all of the x86 instructions/settings that qemu would need, so you might be stuck with ARM64 qemu emulating x86 anyway.


Qemu is only able to achieve native performance when running in conjunction with a hypervisor like KVM. Hypervisors don't do binary translation, so the guest architecture needs to match the host architecture. Running x86_64 qemu under rosetta would likely be much slower than running aarch64 qemu, because it would be running an emulator inside of an emulator.
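
To make the accelerator distinction concrete, a rough sketch (assuming qemu-system-x86_64 is installed and disk.img is a placeholder guest image; the flags are standard QEMU options):

    # same-arch guest on a Linux host: KVM runs guest code (almost) natively
    qemu-system-x86_64 -accel kvm -m 2048 -drive file=disk.img,format=qcow2

    # pure software emulation via TCG: works across architectures, but much slower
    qemu-system-x86_64 -accel tcg -m 2048 -drive file=disk.img,format=qcow2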


From the point of view of Rosetta, qemu's JIT would be completely opaque, so it would suffer severe performance penalties: Rosetta would have to keep translating code from what looks like an aggressively self-modifying JIT.

That said, assuming Qemu runs entirely in user space I would expect it to be able to run under rosetta, and am genuinely curious if it does, and what the perf is - as I said, I would expect it to be much slower than arm64 qemu emulating x86_64, but I'm curious as to how much.


That being said, Rosetta is somehow surprisingly OK with JITs: Java has "OK" performance under it.


Efficient use of an x86-64 build of qemu requires the Hypervisor framework, which isn't available under Rosetta.

It's possible to run qemu without Hypervisor.framework, but that means it's doing its own second layer of translation. This would be horribly inefficient under Rosetta.
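
For a sense of what Hypervisor.framework buys you natively (outside Rosetta), a sketch assuming QEMU is installed on an Apple Silicon Mac (e.g. via Homebrew) and disk.img is a placeholder guest image:

    # arm64 guest: accelerated by Hypervisor.framework (hvf), close to native speed
    qemu-system-aarch64 -M virt -accel hvf -cpu host -m 2048 \
        -drive file=disk.img,format=qcow2

    # x86-64 guest: hvf can't cross architectures, so it's pure TCG emulation
    qemu-system-x86_64 -accel tcg -m 2048 -drive file=disk.img,format=qcow2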


In addition, the CPU settings that make the high-performance cores use TSO (total store ordering) memory ordering are not supported under the hypervisor/virtualization layers.


Rosetta 2 doesn't support x86 hypervisor:

https://developer.apple.com/documentation/apple_silicon/abou...

> What Can't Be Translated?: Virtual Machine apps that virtualize x86_64 computer platforms


Rosetta uses ahead-of-time translation of instructions. For running a VM you need translation at runtime, which qemu can do.


But if they are doing that, then why do they need to use the macOS Hypervisor framework to set up the VM? That wouldn't be required if you were using qemu, would it?

(What you mention would be the simplest possible thing that would work.)


A Linux VM is still needed because these are Linux containers, ie. they need namespaces, cgroups, layered rootfs, etc.


Of course, but my question is how it is created... macOS has a Hypervisor framework for creating VMs, which Docker is using. But I don’t know enough about those internals to understand how they are getting an x86 VM on an ARM host. I know it can be done with qemu emulation, but does that still need the macOS Hypervisor framework, or does it run as a normal user process?

These are the questions I’m trying to figure out...


       (5) Docker Image      (amd64)
                ^
                |
       (4) QEMU Binfmt       (arm64 <-> amd64 binary emulation layer)
                ^
                |
        (3) Linux VM         (arm64)
                ^
                |
   (2) Hypervisor.framework  (arm64, macOS native virtualization framework)
                ^
                |
      (1) Docker for Mac

The Linux kernel has a feature (binfmt_misc[1]) that lets you register a wrapper to execute userspace programs based on their file header. In this case, the Linux VM in (3) has QEMU user-mode emulation registered as a binfmt handler, so any amd64 binary is automatically wrapped into `qemu-x86_64-static /path/to/bin` and run. A Docker image doesn't run its own Linux kernel but uses the one from the VM host, so this scenario is possible.
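
You can poke at this from inside the VM; a sketch (the exact interpreter path and flags depend on how qemu-user-static was registered):

    # list registered binary formats; there should be a qemu-x86_64 entry
    ls /proc/sys/fs/binfmt_misc/

    # show that entry: interpreter path, flags, and the ELF magic/mask that
    # matches x86-64 executables (e_machine == 0x3e)
    cat /proc/sys/fs/binfmt_misc/qemu-x86_64

    # registration itself is a single write in the form
    #   :name:type:offset:magic:mask:interpreter:flags
    # to /proc/sys/fs/binfmt_misc/register (see [1] for the format)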

This is also how multiarch[2] works (for amd64 to arm64/ppc64le/etc.), which might even be what Docker is using. In the case of multiarch, the qemu-*-static binaries are provided via a container that runs in privileged mode.
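
Something like this (going from memory of the multiarch README, so treat the exact invocation as approximate):

    # on an arm64 host/VM: register the qemu-*-static handlers via a privileged container
    docker run --rm --privileged multiarch/qemu-user-static --reset -p yes

    # afterwards, foreign-arch images just run through the registered interpreter
    docker run --rm -t amd64/alpine uname -m    # prints x86_64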

[1]: https://www.kernel.org/doc/html/latest/admin-guide/binfmt-mi...

[2]: https://github.com/multiarch/qemu-user-static


Step 4 was what I was missing.


The VM is an arm64 VM.


Docker Desktop uses Hypervisor.framework to run ARM containers on ARM. Docker Desktop uses QEMU to run x86 containers on ARM.


Docker Desktop uses Hypervisor.framework to run an arm64 VM, which runs ARM containers natively, or x86 containers via qemu (inside that arm64 VM).
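
You can see both paths from the same Docker Desktop install on an Apple Silicon Mac, for example (assuming a Docker version recent enough to support the --platform flag):

    # native arm64 container, runs directly against the arm64 VM kernel
    docker run --rm --platform linux/arm64 alpine uname -m    # aarch64

    # amd64 container, emulated by qemu user-mode inside the same VM
    docker run --rm --platform linux/amd64 alpine uname -m    # x86_64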


I don't think there's a TCG backend (host) for every architecture that has a frontend (guest), so it's probably not quite any-on-any.



