I ran Linux natively as my sole workstation OS for nearly 10 years, and spent a lot of that time tinkering with WINE, including developing and submitting some patches, but eventually I had to give up because advanced things like Photoshop were too spotty in Wine and too slow in VMs.

My solution was ultimately to set up an Arch-based KVM hypervisor with a Windows 10 VM running as the main "workstation", with USB + GPU PCI passthrough and paravirt. The hypervisor also runs Linux VMs, from which I do development work via VNC and/or SSH.

This is the most convenient workflow situation for me, and allows the best of both worlds. It essentially makes Windows act like a desktop environment for a Linux box while maintaining practically-native overall performance for all workloads, including gaming and photo/video editing. It also grants the admin convenience of virtualized environments, since I can use zvols to snapshot everything at once, place clean resource limitations on each environment, etc.
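For anyone wondering what "snapshot everything at once" could look like in practice, here is a minimal sketch. It assumes all VM zvols live under a single parent dataset; the name `tank/vms` and the helper names are hypothetical, not the poster's actual layout. It relies on `zfs snapshot -r`, which snapshots a dataset and all of its descendants together.

```python
import subprocess


def snapshot_command(dataset, label):
    """Build the argv for a recursive ZFS snapshot of `dataset` and its children."""
    if not dataset or not label or "@" in dataset or "@" in label or "/" in label:
        raise ValueError("invalid dataset or snapshot label")
    # -r snapshots the dataset and every descendant (zvols included) together.
    return ["zfs", "snapshot", "-r", f"{dataset}@{label}"]


def take_snapshot(dataset, label):
    """Run the snapshot; needs ZFS installed and sufficient privileges."""
    subprocess.run(snapshot_command(dataset, label), check=True)


if __name__ == "__main__":
    # "tank/vms" is a hypothetical parent dataset holding one zvol per VM.
    print(" ".join(snapshot_command("tank/vms", "pre-update")))
```

Building the command separately from running it makes it easy to print and check the exact command before touching the pool.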

The only thing it would not work for is Linux-based graphics development, but even then, you can get a second GPU and pass it through to another VM, running on a separate display.

Before I got the hypervisor set up, I ran Windows on the hardware with Linux VMs hosted in VirtualBox. The biggest issue with this (aside from the general shame and guilt of using Windows on the hardware) was that Windows would decide it wanted to turn off for MS-enforced updates and bring everything down. Now, Windows is separate and it can crash, reboot, or hurt itself all it wants, and rarely causes any real loss.

Are there instructions anywhere on how to get started with something like this?

Basically you need relatively new hardware. IOMMU is a must (for Intel).

Here is a guide from another user: https://github.com/saveriomiroddi/vga-passthrough. It was really helpful, but I had to give up since my mainboard was too stupid and my graphics card didn't work the way I wanted. I could install the OS, but it boot-looped after installing any driver.

I used an ASUS B350-M mainboard with an AMD Ryzen 5 1500X and a Radeon R9 280 (Tahiti-based, basically the same as an HD 7970), and it didn't work since my mainboard only had one PCI Express x16 slot, which meant that the "better" card needed to be used for the main OS. You can use the first-slot card for the OS, but it might not always work (it is heavily hardware-dependent). Having a good mainboard is way more important for it to work.

This is a neat solution, but why is OVMF required (according to the guide)?

Well, you just need an EFI firmware, and I guess OVMF is the only one the guy knew of (I actually do not know of any other either). If you boot QEMU without any firmware, it would emulate a real-mode device, which probably makes it impossible to access the device correctly (not verified or tested, since my rig never worked). But you can install OVMF on any recent Linux distribution, and you probably don't need to patch it (at least on Ubuntu 17.10, Fedora 27+, Arch Linux...).

Edit: Arch wiki (the best) writes something about it: https://wiki.archlinux.org/index.php/PCI_passthrough_via_OVM...

> OVMF is an open-source UEFI firmware for QEMU virtual machines. While it's possible to use SeaBIOS to get similar results to an actual PCI passthrough, the setup process is different and it is generally preferable to use the EFI method if your hardware supports it.

OVMF is what QEMU uses for its virtual guest UEFI to facilitate PCI passthrough; you don't have to flash your physical motherboard or anything.

There are some guides, but it's very hardware-dependent and touch and go. It took me a couple of weeks to get all of the bugs worked out and things running reasonably smoothly. I would not recommend it for the faint of heart, or those without significant sysadmin experience.

On consumer hardware, it can be hard to find out if you even have IOMMU support, required for passthrough. It's not necessarily new but a lot of hardware doesn't support it. Unfortunately for me, my i7-3770k did not have it (but the i7-3770 non-K did). I did the hypervisor build on a new enterprise-class workstation with a Supermicro motherboard.
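One rough way to check from a running Linux system is to look at the IOMMU groups the kernel exposes in sysfs. The sketch below assumes the standard `/sys/kernel/iommu_groups` layout; the function name is just for illustration. If it finds no groups, the IOMMU is probably unsupported or disabled in the firmware setup or on the kernel command line.

```python
from pathlib import Path


def list_iommu_groups(root="/sys/kernel/iommu_groups"):
    """Return {group_number: [pci_address, ...]} from the sysfs IOMMU tree.

    An empty dict means the kernel exposed no groups, which usually means the
    IOMMU is unsupported or not enabled.
    """
    base = Path(root)
    groups = {}
    if not base.is_dir():
        return groups
    for group_dir in base.iterdir():
        devices_dir = group_dir / "devices"
        if not group_dir.name.isdigit() or not devices_dir.is_dir():
            continue
        groups[int(group_dir.name)] = sorted(d.name for d in devices_dir.iterdir())
    # Sort by group number so the output is stable and easy to read.
    return dict(sorted(groups.items()))


if __name__ == "__main__":
    found = list_iommu_groups()
    if not found:
        print("No IOMMU groups found")
    for number, devices in found.items():
        print(f"group {number}: {' '.join(devices)}")
```

Devices that share a group generally have to be passed through together, so it is worth checking that the GPU you want to hand to the guest is not grouped with something the host still needs.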

I use libvirt for the config. I started to build a custom kernel to enable some extra features, since I had to compile dev branches anyway to troubleshoot periodic hard locks on kernels 4.12 and 4.13, but the setup should work with a stock kernel.

For me, the biggest hangups on the checklist were:

a) Ensure 100% UEFI everywhere. It's possible to do this with BIOS, but as far as I understand, that path is not well tested anymore. It can also be hard because of the way boards are sort of straddling a middle ground between UEFI and BIOS; if you don't set everything to explicit UEFI in the setup, it may init the system with either, or may init the BIOS first for hardware compat, which will make things weird.

If your video card is slightly older and from the time when UEFI was just getting supported by PC mobos and does not have UEFI boot, you may be able to find a UEFI-compatible video BIOS online. I had to do this for my GTX 670 (I use a 1070 most of the time though, and it did not need this).

b) Kernel VGA options, particularly video=efifb:off. In theory you don't need this but it depends on your board, hardware, etc. Ensures that the video device is available for vfio-pci to grab and block out other drivers that may try to grab it later. The downside is no video out after the bootloader so you can't watch the boot process. I use a USB serial port to watch the boot now, but for a long time I just waited until SSH came up, and if it didn't come up, used the systemd emergency console to try to poke it blind. This made things far harder than necessary. Get serial output ASAP, or use a board with a built-in IPMI. Didn't realize it was possible to order the workstation I got without it, or I would've made triple-sure I had it. Would've made things way less aggravating.

Along the same lines, having multiple GPUs helps a lot with this, especially if you can configure your system and/or move things around in their slots so that the one you don't want to passthrough gets initialized first. Then you don't have to worry as much about competing with the system EFI, bootloader, or kernel for raw control over the device. Not strictly necessary, but useful.

c) nvidia Code 43 on passthrough. This is nvidia trying to extort some sort of ridiculous data center license out of you or something. There are various potential fixes that are somewhat easy to find floating around, particularly QEMU flags, but this is another one where you just have to poke around until you find a combination of options that works. For recent drivers, one of these things is setting the initial substring of your mobo name to a recognized consumer vendor in QEMU/libvirt.

d) For good sound in the Windows VM, you need to hack the driver ini to use MSI for interrupts, which requires running in Windows "Test Mode", aka unsigned driver mode. This breaks some anti-cheat software, and I returned PUBG because it didn't allow the game to start while Windows is in Test Mode. You also have to set / tweak specific custom args on QEMU and the host's PulseAudio to get the timing right. Without this, the audio drifts very noticeably out of sync. Alternatively, pass through your sound card or use HDMI audio.
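A couple of the items above can be sanity-checked from the host over SSH. The sketch below is only an illustration: it checks whether `video=efifb:off` is on the running kernel's command line and which driver currently owns a PCI device, using the standard `/proc/cmdline` and `/sys/bus/pci/devices` paths. The PCI address `0000:01:00.0` and the helper names are assumptions; substitute your own GPU's address.

```python
import os


def cmdline_has(option, cmdline_path="/proc/cmdline"):
    """True if `option` appears as a whole token on the kernel command line."""
    with open(cmdline_path) as f:
        return option in f.read().split()


def bound_driver(pci_address, sysfs_root="/sys/bus/pci/devices"):
    """Name of the driver bound to a PCI device, or None if nothing is bound."""
    link = os.path.join(sysfs_root, pci_address, "driver")
    if not os.path.islink(link):
        return None
    # The "driver" entry is a symlink into the owning driver's sysfs directory.
    return os.path.basename(os.readlink(link))


if __name__ == "__main__":
    gpu = "0000:01:00.0"  # hypothetical address, check yours with lspci
    if os.path.exists("/proc/cmdline"):
        print("video=efifb:off set:", cmdline_has("video=efifb:off"))
    print("driver bound to", gpu, "->", bound_driver(gpu))
```

If the second line reports anything other than `vfio-pci` for the card you intend to pass through, some other driver grabbed it first.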

Obviously there were a bunch of other little hiccups but this is what stands out to me off the top of my head. Best resources are Arch Linux wiki on passthrough, Proxmox forums, and /r/vfio.

All this said, the biggest source of pain was not related to IOMMU or virt at all, but rather LVM2-based thin provisioning and device mirroring before I switched as much as I could to ZFS (still working on the last few pieces). ZFS is somewhat stricter but it works reliably. LVM would frequently make boot hang, fail to reinitialize the volumes correctly, etc.

Happy to help anyone going down the same road. It's really a great setup once it's running and I'm sure there'd be more envy for it if I had the energy to do a big write up that showcased it to everyone. :P

Here's a sub-reddit as well: https://www.reddit.com/r/VFIO/

> It would only not work for Linux-based graphics development, but even then, you can get a second GPU

My impression is that dGPU "hot" swapping between non-running guests has gotten easier, but that swapping back to the host after a guest is still a hardware/drivers/kernel "maybe it just works, or maybe you can't get there from here" situation.

Ever since I read about this a few years ago, I've really wanted to try it, but I don't want to buy an extra GPU for it. I hope AMD [0] brings their SR-IOV implementation, called MxGPU, down to their mainstream GPUs, which would allow splitting a single GPU between host and guests. Apparently this would also be more secure than passthrough?

In order to not affect their pro GPU sales they could maybe limit the number of virtual GPUs from 16 to 2, which would be enough for the host and a single guest.

[0] or Nvidia, I don't care, but since Nvidia is the market leader they have less incentive than AMD.

On my list is a GPU passthrough setup, which is described in the following blog post [1] (with screenshots). I have not set it up yet, but I will try it out the next time I build my home desktop from the ground up.

[1] https://davidyat.es/2016/09/08/gpu-passthrough/

Would you be so kind as to do a more in-depth write-up of your current setup? It really sounds awesome!

curious about this setup. screenshots?
