
Playing with a Raspberry Pi 4 64-bit - _ananos_
https://blog.cloudkernels.net/posts/rpi4-64bit-virt/
======
Havoc
>Lightweight virtualization is a natural fit for low power devices

No it's not. The pi is punchy enough for it sure, but the above doesn't follow
in my mind. Low power = limited resources, so ideally you want to run on bare
metal to minimise overheads.

~~~
_ananos_
sure! bare-metal would be awesome. But isn't it a shame not to take advantage
of all the virtualization/containerization goodies out there? After all, the
virtualization overhead nowadays only applies to I/O (network & storage).
CPU/MEM are being virtualized using hardware extensions at (near-)native
speeds.

~~~
Havoc
>CPU/MEM are being virtualized using hardware extensions at (near-)native
speeds.

Is that really true for rasp level gear? Haven't tested it but my gut feel
tells me the hit is sizable.

Anyway - don't let that dissuade you from the mission. Container goodies on a
rasp is a grand idea...just don't think it's quite as hit free as the article
suggests ;)

~~~
_ananos_
definitely! that's our initial goal. Quantify the penalty and examine the
trade-offs. Clearly, virtualizing workloads on such devices with standard
VMMs/hypervisors isn't ideal. And we're working towards this direction;
playing with the systems stack is what makes us tick, so, it will be a fun and
(hopefully) useful adventure :D

~~~
Havoc
>Quantify the penalty and examine the trade-offs.

If you do have numbers available I'd love to see a write-up on what the real
world hit is.

I used to run a rasp3b for home server but that just wasn't punchy enough
(crappy fake gigabit etc). So my old gaming laptop became a server w/ docker
etc.

But itching to justify a rasp 4.

Anyway...thanks for exploring 64...thought the 32 on rasp 3 was unfortunate.

~~~
bjoli
I have a raspberry pi 4. That thing runs HOT. 64c idle (44c above ambient)
without any case, but with a PoE hat and disabled fan.

The PoE hat makes attaching a proper heat sink (or using the flirc case)
impossible, so I have now rolled my own using copper coins.

I am not comfy with letting my pi run that hot 24/7, and I don't want active
cooling. I bought a rock pi 4 which has a PoE hat that can be combined with a
HUGE heat sink, and there I have no heat issues at all.

There are cases you can use when you don't have a PoE hat, but the original pi
4 case almost throttles the pi at idle.

~~~
Dork1234
I have been testing an rpi4 and comparing it to the rpi3. The RPI4 is actually
the first Pi that is usable as a desktop computer, but I have seen thermal
issues like you. Many of the early reviews only did short tests with the
cover off. The thermal CPU throttling on the RPI4 is huge: while the RPI3
might slow things down 30-50%, the RPI4 will slow things down 100%. Inside the
case, the RPI4 I tested was running at close to RPI3 speeds. I don't think a
heatsink or CPU fan is needed; there is just not enough heat being generated to
warrant the expense. I have had good luck running the RPI with just the cover
off and some sort of air circulation in the room.

I have been trying to figure out the cheapest way to cool the RPI4, I am
thinking a single slow 5V fan blowing across the card would be the best
solution. This would also cool off the VLI, and take up less room on the card.
I am just hoping someone will make a case and fan combination that works at a
good price point. If things get too pricey then other SBCs look very
interesting.

I am really disappointed the thermal issues were not considered. A simple fan
header like on UDOO boards, or an official case with space and air flow so a
heat-sink could actually work would make me feel a lot better.
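
A quick way to tell whether a board is actually being throttled, rather than inferring it from benchmark results, is the firmware's `vcgencmd get_throttled` command, which prints a hex bitmask. Below is a minimal sketch of decoding it, assuming the bit layout given in the Raspberry Pi documentation; the `decode_throttled` and `read_throttled` names are made up for illustration.

```python
import subprocess

# Bit positions as described in the Raspberry Pi firmware documentation
# (assumed here; check the docs for your firmware version).
FLAGS = {
    0: "under-voltage detected",
    1: "ARM frequency capped",
    2: "currently throttled",
    3: "soft temperature limit active",
    16: "under-voltage has occurred",
    17: "ARM frequency capping has occurred",
    18: "throttling has occurred",
    19: "soft temperature limit has occurred",
}


def decode_throttled(line):
    """Turn a line such as 'throttled=0x50005' into a list of readable flags."""
    value = int(line.strip().split("=", 1)[1], 16)
    return [name for bit, name in FLAGS.items() if value & (1 << bit)]


def read_throttled():
    """Run vcgencmd on the Pi itself and decode its output."""
    out = subprocess.run(
        ["vcgencmd", "get_throttled"],
        capture_output=True, text=True, check=True,
    ).stdout
    return decode_throttled(out)
```

Calling `read_throttled()` before and after a long benchmark run, with the case on and off, would show whether the "has occurred" bits get set.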

~~~
true_tuna
I’m assuming you updated the firmware? There was a fix that lowered the
temperature significantly.

~~~
Dork1234
I haven't installed the latest firmware. Is there a way to back up my current
firmware so I can compare them easily?

~~~
unikernelenthu
Try this one:
[https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=2435...](https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=243500&p=1490467#p1490467)

I believe the old binary is there and you can revert using the same mechanism

------
Vogtinator
Currently u-boot does not support the RPi4 properly (some hardcoded registers
and clock numbers), but patches are pending. So EFI support as required by
Arch, openSUSE and some others is not available yet.

Progress for actual mainline support can be followed here:
[https://github.com/lategoodbye/rpi-zero/tree/bcm2838-initial](https://github.com/lategoodbye/rpi-zero/tree/bcm2838-initial)

------
LeonM
There seems to be some work-in-progress with supporting AArch64 for the rPi.

This looks like a fun FOSS subject to contribute to once I get my hands on a
Pi4. I like the low-level stuff.

Does anyone here have recommendations on where to contribute?

~~~
Y_Y
By the way, the 3B+ is armv7 by default, but also works fine as aarch64.
That's how I use mine under nixos.
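
To check which mode a given install is actually running in, something like the following sketch can help. It assumes a Linux system where `platform.machine()` reports `aarch64` for a 64-bit ARM kernel and `armv7l` for a 32-bit one; the `arm_mode` helper is hypothetical. The pointer size is checked separately because a 64-bit kernel can still run a 32-bit userland.

```python
import platform
import struct


def arm_mode(machine=None, pointer_bits=None):
    """Return (kernel_is_aarch64, userland_pointer_bits).

    Both arguments default to values detected on the running system;
    they can be passed explicitly for testing.
    """
    if machine is None:
        machine = platform.machine()
    if pointer_bits is None:
        # Size of a C pointer in this Python interpreter, in bits.
        pointer_bits = struct.calcsize("P") * 8
    return machine in ("aarch64", "arm64"), pointer_bits
```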

------
nnikoleris
Very impressive. In your RPi4 experiments would you benefit from having more
DRAM if you didn't have the 1GB limit? I'm wondering how much difference the
GIC makes. Do you think there is a way to break down the performance uplift?

~~~
_ananos_
More RAM could help us with a scale test, or maybe with a storage-related
benchmark.

Regarding the GIC, first I think we should spend some time examining the
benefits from the A72 upgrade. Then, breaking down the time spent on each step
of the VM lifecycle should be fairly straightforward (annotate EL changes,
capture numbers with perf or something similar).

~~~
vardump
I have two 4 GB RPi4s, and can run any tests if you like.

Or is this about some 1GB barrier DMA limitation?

I did notice VideoCore can only access maximum 1GB of RAM. Related to that?

~~~
_ananos_
the issue about the available RAM is this:
[https://github.com/raspberrypi/linux/commit/cdb78ce891f6c636...](https://github.com/raspberrypi/linux/commit/cdb78ce891f6c6367a69c0a46b5779a58164bd4b#diff-634f284364ba43ef69912111615b08ef)

probably some kind of address mapping, but I'm no expert on this stuff ;-)

~~~
vardump
Saw that, but that doesn't really say why, or how much effort is required to
solve this. Maybe the SD stuff is on the VideoCore side, and it simply can't DMA
above the 1 GB limit, so it needs DMA buffers below 1 GB plus a copy for
transfers to upper memory. Maybe.

------
qalmakka
I hope they're going to figure out AArch64 support for the Raspberry Pi soon,
especially since they have launched the models with larger amounts of RAM.
Having a true AArch64 Raspberry Pi would be really nice.

~~~
jchw
I hope we can get true AArch64 support as well, for many different reasons.
One is that Dolphin-emu has an AArch64 JIT but not a 32 bit ARM one. While
it’s probably not terribly hard to get the Pi4 booting to 64 bit, which is
actually probably good enough if you’re just interested in virtualization, I
have no idea how to get the VideoCore 6 drivers in 64 bit and if I had to
guess, I’d guess shit out of luck for now.

(I am not sure Dolphin-emu would run with considerable performance, but damn
if I’m not curious.)

~~~
floatboth
VC4 works great in 64-bit, I've had WebGL running on some browser under Weston
on ArchLinuxARM on a Pi3. I would expect V3D (VideoCore 5/6) to work out of
the box on 64-bit too. Mesa very rarely introduces any unportable code – e.g.
RadeonSI pretty much just works on FreeBSD/aarch64 :)

~~~
jchw
Wait - is the VC4/V3D driver open source? I feel like I missed that bit.

So basically, I should just cross compile RPi Linux and Mesa and it should
work?

To be honest, I have been confused about the VideoCore driver situation from
day 1, but I thought it was a blob.

------
londons_explore
I thought the whole point of things like docker is they gave native
performance? Isn't docker just a combination of cgroups, network namespaces,
pid namespaces, fancy filesystem mounts, etc.

The overhead should be zero by design.

What am I missing?
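
The namespace part of that description can be seen directly on a Linux box: every process has a set of links under `/proc/<pid>/ns`, and two processes share a namespace exactly when the identifiers match. Here is a rough sketch, with made-up helper names, that reads them; a process inside a container would typically show different identifiers from one on the host.

```python
import os


def parse_ns_link(target):
    """Split a link target such as 'pid:[4026531836]' into ('pid', 4026531836)."""
    kind, _, ident = target.partition(":")
    return kind, int(ident.strip("[]"))


def read_namespaces(pid="self"):
    """Map each entry in /proc/<pid>/ns to its namespace identifier.

    Reading another process's entries may need extra privileges.
    """
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    for name in sorted(os.listdir(ns_dir)):
        target = os.readlink(os.path.join(ns_dir, name))
        result[name] = parse_ns_link(target)[1]
    return result
```

Comparing `read_namespaces()` on the host with the same call run inside a container shows which namespaces were actually unshared.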

~~~
_ananos_
there are a number of factors to consider when comparing these kinds of
technologies.

Linux containers provide native performance for almost all applications, true;
but there are tons of implications when it comes to multi-tenancy (security,
QoS, trusted execution etc.).

Moreover, spawning an application as a docker container can incur significant
overhead on startup time, FS setup, FS access etc.

So in theory yes, containers provide native performance. But do we really want
to run apps on multi-tenant edge devices as containers ? I would argue not
necessarily ;)

------
anfractuosity
Very interesting! When running KVM on the Pi, does that use virtualisation
extensions if the processor has them? (I don't know anything about ARM and
virtualisation).

~~~
_ananos_
yeap, that's the case with the older models too. The A53 also has virtualisation
extensions. The main difference with the Pi4 is the GIC, which removes the
need for emulating interrupt handling.

Other than that, the A72 handles VMEnters/VMExits by trapping in/out of
EL0/EL1/EL2.
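
From userspace, the simplest sign that the kernel has KVM enabled is the `/dev/kvm` device node. Here is a minimal sketch of that check; the `kvm_usable` name is made up, and the check only says the node is present and accessible, not how well guests will perform.

```python
import os


def kvm_usable(dev_path="/dev/kvm"):
    """Return True if the KVM device node exists and this user can open it read/write."""
    return os.path.exists(dev_path) and os.access(dev_path, os.R_OK | os.W_OK)


if __name__ == "__main__":
    print("KVM available:", kvm_usable())
```

If the node is missing on a Pi, the usual suspects are a kernel built without KVM or a boot chain that did not start the kernel in EL2.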

------
mikorym
I know that this is for 64-bit on the new Raspberry Pi specifically, but would
the Pi 4 be able to run a Playstation 2 emulator?

~~~
vardump
I'm assuming you'd want to have playable framerate.

Maybe. With a herculean effort, way surpassing tricks and techniques used by
x86 PCSX2 and other PS2 emulators.

In terms of pure CPU computational performance, yes. You'd need to have very
clever ways to emulate the "Emotion Engine" parts, like the VPU0 and VPU1 vector
processors. A single RPi4 core would far exceed those, but using it in an
emulator might be very hard, because you'll probably need to synchronize
execution and to emulate the PS2's internal memory model. Emulating the PS2's
300 MHz MIPS core would be a walk in the park compared with the other issues.

But memory bandwidth might be the true showstopper. I think RPi4 has just one
32-bit DDR4 channel at 2400 MHz, which yields just 9.6 GB/s theoretical
maximum bandwidth. More than what PS2 got, but the margins for emulation are
uncomfortably low.

So while it's "possible", I don't think it'll happen at playable framerates.
It'd be _easier_ to reverse engineer the rendering engines in popular PS2
games and to simply _port_ them.
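
The 9.6 GB/s figure above is just bus width times transfer rate. Here is a small worked version, assuming the quoted "2400 MHz" means 2400 million transfers per second on a single 32-bit channel (the comment itself hedges these numbers); the function name is made up.

```python
def peak_bandwidth_gb_s(bus_width_bits, transfers_per_second):
    """Theoretical peak bandwidth in GB/s (10**9 bytes per second)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfers_per_second / 1e9


# Assumed figures from the comment: one 32-bit channel at 2400 MT/s.
rpi4_estimate = peak_bandwidth_gb_s(32, 2400e6)
print(rpi4_estimate)  # 9.6 under these assumptions
```

Real sustained bandwidth will be lower than this theoretical peak, which only tightens the margins for emulation.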

~~~
mikorym
> It'd be easier to reverse engineer the rendering engines in popular PS2
> games and to simply port them.

What do you mean by that?

~~~
vardump
That porting the games might involve _fewer_ technical challenges and _less_
work than simply emulating them on this platform.

