
64 bit OS Raspberry Pi4 Benchmarks - tosh
https://medium.com/@matteocroce/why-you-should-run-a-64-bit-os-on-your-raspberry-pi4-bd5290d48947
======
DCKing
The CPU of the Raspberry Pi 4 is the only 64-bit ARM chip I've ever seen that
doesn't include native AES instructions. As a result, it's much lower
performance for network or disk encryption use cases.

Up until the RPi 4 I thought AES instructions were a part of AArch64, but I
was wrong. Such a weird omission to make on (I expect) Broadcom's side. All
other 64-bit ARM SBCs just have it, even the low cost ones.

~~~
fulafel
AES is quite efficient in software. The RPi4 should be able to saturate its
network connectivity many times over with SW AES.

A 200 MHz Pentium II could saturate 100 Mbps with AES-128 [1]; the RPi4 has
4 x 1500 MHz cores (with 64-bit ALUs and NEON available).

[1]
[https://www.di.ens.fr/~granboul/recherche/AES/timings.html](https://www.di.ens.fr/~granboul/recherche/AES/timings.html)

edit: There are RPi4 AES and other benchmarks in this "openssl speed" paste:
[https://gist.github.com/HimaJyun/f05d3017dfb05a4ccb0def010bb...](https://gist.github.com/HimaJyun/f05d3017dfb05a4ccb0def010bb2c91a)
It indicates 50-80 MB/s = 400-640 Mbps per core.
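
The per-core figure is a plain bytes-to-bits conversion. A minimal sketch of
that arithmetic, assuming decimal megabytes; the four-core total assumes
perfect scaling across cores, which the linked paste does not measure, and the
function name is illustrative only:

```python
def mbytes_to_mbits(mb_per_s):
    """Convert megabytes per second to megabits per second (8 bits per byte)."""
    return mb_per_s * 8


CORES = 4  # the RPi4 core count mentioned above

low, high = mbytes_to_mbits(50), mbytes_to_mbits(80)
print(f"per core:  {low}-{high} Mbps")
# Assumption: throughput scales linearly with cores (not measured in the paste).
print(f"all cores: {low * CORES}-{high * CORES} Mbps")
```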

~~~
wolf550e
Fast table-based software AES implementations leak bits of the secret key
through array offsets, so anything that shares the chip's cache can recover
the key. A software AES implementation that resists side-channel attacks by
using bitslicing is slow; you would be better off using chacha20-poly1305.
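
To make the "array offsets" point concrete, here is a rough sketch of a
secret-indexed table lookup next to a lookup that touches every entry and
selects with a mask. This is not bitslicing and not real AES: the table is a
stand-in, the function names are made up, and Python gives no timing
guarantees, so it only illustrates the access pattern.

```python
# Stand-in 256-entry table; NOT the real AES S-box.
SBOX = [(i * 7 + 3) % 256 for i in range(256)]


def lookup_leaky(index):
    # The memory location read depends directly on the secret index,
    # which is what a cache-timing attacker can observe.
    return SBOX[index]


def lookup_full_scan(index):
    # Read every entry and keep only the wanted one via a mask, so the
    # sequence of memory accesses does not depend on the secret index.
    result = 0
    for i, value in enumerate(SBOX):
        diff = i ^ index
        # mask is 0xFF when i == index and 0x00 otherwise, without branching
        mask = ((diff - 1) >> 8) & 0xFF
        result |= value & mask
    return result


print(lookup_leaky(42), lookup_full_scan(42))
```

The full scan does 256 reads per lookup instead of one, which hints at why
side-channel-resistant software AES costs more than the table-based version.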

~~~
fulafel
So you're saying OpenSSL might still be using side channel vulnerable AES
code, all these years after there was a big row about it?

This ARMv8 impl is constant time according to its comments:
[https://github.com/openssl/openssl/blob/master/crypto/aes/as...](https://github.com/openssl/openssl/blob/master/crypto/aes/asm/vpaes-armv8.pl)

There are also other ARM AES implementations in the tree:
[https://github.com/openssl/openssl/tree/master/crypto/aes/as...](https://github.com/openssl/openssl/tree/master/crypto/aes/asm)

AFAICT one is for the hardware AES instructions, one is a bitsliced (so
constant-time) impl for 32-bit ARMv7, and one is for low-end ARMv4. The latter
sounds like it might still be vulnerable to timing side-channel attacks, but I
have no idea whether it is still used by default in any configuration...

~~~
wolf550e
I don't know about openssl's status on old / weak platforms. People who need
AES without hardware AES instructions should be using BearSSL code or the rust
code that has been reviewed by the author of BearSSL (see e.g.
[https://github.com/RustCrypto/block-ciphers/issues/65](https://github.com/RustCrypto/block-ciphers/issues/65)).

------
joecool1029
One of the main benefits of running a proper aarch64 userspace is the
ability to run a modern, functional Firefox install. The 32-bit builds usually
don't support anything newer than a very old LTS release. It's been hell
trying to build and/or run that target (see:
[https://bugzilla.mozilla.org/show_bug.cgi?id=1452128](https://bugzilla.mozilla.org/show_bug.cgi?id=1452128)).

This is especially important if considering use as a low-end desktop as
Chromium will eat through 4GB of ram like it's nobody's business.

~~~
drmpeg
I'm running Firefox 73 on my BeagleBoard-X15 (Cortex-A15). Ubuntu 18.04.

    $ dpkg -l | grep firefox
    ii  firefox  73.0.1+build1-0ubuntu0.18.04.1  armhf  Safe and easy web browser from Mozilla

~~~
joecool1029
So that's why it's showing up with 59 being the latest versions from the
distro for arm/arm64?
[https://packages.ubuntu.com/bionic/web/firefox](https://packages.ubuntu.com/bionic/web/firefox)

~~~
drmpeg
It's in bionic updates.

[https://packages.ubuntu.com/bionic-updates/firefox](https://packages.ubuntu.com/bionic-updates/firefox)

[http://ports.ubuntu.com/pool/main/f/firefox/firefox_73.0.1+b...](http://ports.ubuntu.com/pool/main/f/firefox/firefox_73.0.1+build1-0ubuntu0.18.04.1_armhf.deb)

------
rgovostes
One of the things that I find great about Raspbian is that the boot files are
stored on a FAT32 partition which is readable from any OS, and there are some
nice considerations for headless setups: drop in a wpa_supplicant.conf to
configure Wi-Fi, `touch ssh` to get OpenSSH enabled at next boot. Do any of
the alternative OSes, like Ubuntu, offer this?
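
For reference, the two drop-in files described above can be written from any
machine that can mount the FAT32 partition. A minimal sketch, assuming the
commonly used wpa_supplicant.conf layout; the function name, network name,
passphrase, and country code are placeholders, and the demo writes to a
throwaway directory rather than a real boot partition:

```python
import tempfile
from pathlib import Path


def prepare_headless_boot(boot_dir, ssid, passphrase, country="US"):
    """Drop the two files Raspbian looks for on the boot partition."""
    # An empty file named "ssh" enables OpenSSH at next boot.
    (boot_dir / "ssh").touch()
    # Wi-Fi settings that Raspbian picks up at boot.
    conf = (
        "ctrl_interface=DIR=/var/run/wpa_supplicant GROUP=netdev\n"
        "update_config=1\n"
        f"country={country}\n"
        "\n"
        "network={\n"
        f'    ssid="{ssid}"\n'
        f'    psk="{passphrase}"\n'
        "}\n"
    )
    (boot_dir / "wpa_supplicant.conf").write_text(conf)


# Demo against a temporary directory standing in for the mounted partition.
with tempfile.TemporaryDirectory() as tmp:
    prepare_headless_boot(Path(tmp), "ExampleNet", "example-passphrase")
    print(sorted(p.name for p in Path(tmp).iterdir()))
```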

~~~
4d617832
So that's how they came up with that. As someone who normally installs
"normal" Linux systems I find it quite irritating that you have to put a file
somewhere, especially the boot record, to "enable" ssh of all things. Up until
now I considered it a weird decision. (and I still think it is not optimal) I
deploy my RPi's in the field and don't put a monitor on them so I would expect
ssh running as default. First time I found out about it was when reading the
unit file when I was building a custom image based on raspbian, so I wouldn't
consider it obvious :) When working on a Linux Device I just mount the main
partition and do my customizations.

~~~
Maxious
Raspbian did originally come with SSH enabled by default, but the default
credentials pi/raspberry made misuse trivial:
[https://www.zdnet.com/article/linux-malware-enslaves-raspber...](https://www.zdnet.com/article/linux-malware-enslaves-raspberry-pi-to-mine-cryptocurrency/)

~~~
4d617832
Probably the right decision then. As I don't put them on public networks and
delete the pi user this is of little concern to me, but given the target
group, it is a simple safety measure.

~~~
proverbialbunny
I think there is a better solution: On a new install on first login, over ssh
or on the gui, a user/password must be created.

This way the initial login only works once. Both gui user/pass and ssh
user/pass are tied by default.

~~~
foxrider
This is how it works in ARMbian. It forces a password change on first login.
It can be annoying if you intend on deleting the alarm user right after that,
but I can easily see why. "Default" passwords are always suboptimal.

------
gok
One downside: you will probably need a bit more memory to do the same work. An
unmentioned benefit: better ASLR, for what that's worth.

> A 64 bit system means that RAM can be accessed in 8 byte read/writes per
> instruction.

I mean... kind of, but that's probably not really what's happening in this
benchmark. Firstly, in that test `memset()` is surely using NEON instructions
internally on both ARMv7 and AArch64, which can load/store up to 32 bytes in a
single instruction. Further, that test is really just showing the bandwidth of
the memory controller. I'm not sure why AArch64 would matter there. It's
possible that `memset` / `memcmp` are using smarter prefetching instructions
in AArch64.

~~~
m4rtink
I wonder if using zram or zswap can help "eat" those overly long pointers in
RAM via compression?
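
A rough way to get a feel for this question: pack the same set of made-up
object addresses as 32-bit and as 64-bit values and compress both. This is
only a sketch under loud assumptions: the addresses and heap bases are
invented, and zlib stands in for whatever compressor zram or zswap is actually
configured with, so the numbers say nothing about a real workload.

```python
import struct
import zlib

COUNT = 4096
offsets = [16 * i for i in range(COUNT)]  # made-up object offsets

BASE_32 = 0x10000000          # hypothetical 32-bit heap base
BASE_64 = 0x0000FFFF80000000  # hypothetical 64-bit heap base

# Same objects, stored as 4-byte and as 8-byte little-endian pointers.
as_32bit = struct.pack(f"<{COUNT}I", *(BASE_32 + o for o in offsets))
as_64bit = struct.pack(f"<{COUNT}Q", *(BASE_64 + o for o in offsets))

for label, raw in (("32-bit", as_32bit), ("64-bit", as_64bit)):
    packed = zlib.compress(raw)
    print(f"{label}: {len(raw)} bytes raw, {len(packed)} bytes compressed")
```

The upper bytes of the 64-bit values repeat in every entry, which is the kind
of redundancy a page compressor can exploit.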

~~~
cellularmitosis
I've never understood why zswap decompresses the pages before writing them out
to disk. Disk IO is the slowest part of the whole chain, you've already paid
to compress the data, why on earth would you throw that away AND use up more
disk IO?

~~~
m4rtink
My guess is that this is due to some inherent limitation of the Linux kernel
swap mechanism? Possibly so that you can safely disable zswap at runtime (you
can)?

Or quite possibly no one has implemented it yet, and writing uncompressed
pages is transparently compatible with what normal swapping does.

------
CivBase
Noob question: What kind of optimizations are even responsible for those kinds
of performance improvements?

32 bits is enough to utilize all 4 GB of the Raspberry Pi4, so I figured the
only benefit of using a 64 bit OS would be to support 64 bit software. Why
would a 64 bit build perform better than a 32 bit build on the same hardware?

~~~
yalok
64-bit ARM NEON (SIMD instruction set) is much more efficient vs 32-bit. And
not just NEON, there are more registers as well, deeper pipeline, etc.

~~~
xeeeeeeeeeeenu
Also, 32-bit ARMs often (but not always) don't have hardware integer division.

~~~
pm215
Integer division is mandatory from v7VE on upwards (so roughly Cortex-A9, A15
and later), and it's definitely in the 32-bit support of any 64-bit capable
core. If Raspbian is compiling its packages for the lowest common denominator
arm v6 cpu some Pis have, then you'd get some slowdown from division being an
out-of-line call (to library code that can use the hw insn), but you'd see
that part of the speedup just from building for 32-bit but optimised for newer
cores than v6, I think.
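
On a core with no divide instruction at all, the out-of-line routine has to
compute the quotient in software. A minimal sketch of the classic
shift-and-subtract approach; it is illustrative only and not the actual
runtime library routine, and the function name is made up:

```python
def udiv_shift_subtract(numerator, denominator, bits=32):
    """Unsigned restoring division: one quotient bit per loop iteration."""
    if denominator == 0:
        raise ZeroDivisionError("division by zero")
    quotient, remainder = 0, 0
    for i in range(bits - 1, -1, -1):
        # Bring down the next bit of the numerator.
        remainder = (remainder << 1) | ((numerator >> i) & 1)
        # If the divisor fits, subtract it and set this quotient bit.
        if remainder >= denominator:
            remainder -= denominator
            quotient |= 1 << i
    return quotient, remainder


print(udiv_shift_subtract(100, 7))
```

Thirty-two loop iterations per division, compared with a single instruction,
gives a sense of where the slowdown would come from.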

~~~
my123
Cortex-A9 does not have hardware division. Cortex-A7 and Cortex-A15 onwards
do.

~~~
pm215
Yes, you're right, I misremembered. (It came in with CPUs with the
virtualization extension, so not A9.)

------
threatofrain
There's also 64-bit Ubuntu Server and Core.

→
[https://www.raspberrypi.org/downloads/](https://www.raspberrypi.org/downloads/)

------
rubyn00bie
Slightly off topic, but kind of on theme at least: does anyone know where to
get an SBC (single board computer) with 6GB of RAM... or even 4.5GB (that
doesn't cost like ~$300)? The Raspberry Pi is almost perfect, I just need
slightly more RAM per unit.

ARM or X86 I really don't care...

~~~
thrwaway69
You should get an old server rack at that price point unless space is
important for you.

Udoo - [https://shop.udoo.org/udoo-x86-ii-ultra.html](https://shop.udoo.org/udoo-x86-ii-ultra.html)

LattePanda - [https://www.lattepanda.com/products/lattepanda-alpha-864s.ht...](https://www.lattepanda.com/products/lattepanda-alpha-864s.html)

There are a few more. You could also build your own. All you need is a mini
ITX motherboard, RAM, a processor and something to power it. Get them used
and it will be way more powerful.

~~~
rubyn00bie
Yeah; I just want something smaller and easier to toss around than the giant
ass server I have sitting next to me.

I've got a LattePanda v1 but I've been a bit disappointed with it. Udoo bolt
is my next test but they're just a bit more expensive than I'd like.

------
0xcoffee
Apparently in `config.txt` you can set `arm_64bit=1`

Then `sudo rpi-update`

~~~
echlebek
This results in a 64-bit kernel, but 32-bit userland, whereas the arm64 Debian
is entirely 64-bit.
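
One way to see which combination you ended up with is to compare the
architecture the kernel reports with the pointer size of a userland program.
A small sketch, assuming the Python interpreter comes from the distro's own
userland; the function name and the list of 64-bit machine strings are
illustrative and not exhaustive:

```python
import platform
import struct


def describe_bitness(machine, pointer_bytes):
    """Summarise kernel architecture versus userland pointer width."""
    kernel_bits = 64 if machine in ("aarch64", "arm64", "x86_64") else 32
    userland_bits = pointer_bytes * 8
    return f"{kernel_bits}-bit kernel, {userland_bits}-bit userland"


# platform.machine() comes from the kernel (like `uname -m`);
# struct.calcsize("P") is the pointer size of this interpreter's build.
print(describe_bitness(platform.machine(), struct.calcsize("P")))
```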

~~~
MuffinFlavored
Do you need the `arm_64bit=1` flag to get 64-bit kernel, 64-bit userland or
no? I would have thought 64-bit everything would have been the default.

~~~
Zaskoda
I just gave it a try and now apt is installing a bunch of new kernel .img
files. So for me, yes, this seemed to be required before 64bit kernels were
installed. I'm feeling surprised. I'm 40% of the way through the install and
hoping it reboots okay in the end.

Edit: to be more clear, I also ran 'apt update' afterwards

~~~
Zaskoda
Update: oh yeah, that broke a lot of stuff

------
louwrentius
So running the Pi in 64-bit is really overall beneficial. I'm running a
customised version of Ubuntu for the Pi4.

[https://ubuntu.com/download/raspberry-pi](https://ubuntu.com/download/raspberry-pi)

I've also customised the image by adding users, public keys and such. Removing
some of the cloud cruft.

These instructions make it very, very easy and you can do this on an x86
machine. Just make sure to use /usr/bin/qemu-aarch64-static (64 bits) instead
of qemu-arm-static (32 bits).

[https://powersj.io/post/raspbian-edit-image/](https://powersj.io/post/raspbian-edit-image/)

~~~
mlyle
The flip side is that if you have a 1GB Pi, you're going to be wasting a whole
lot of RAM on wider pointers, and you may be more constrained by RAM than by
CPU throughput.

(Also, 64 bit Pi is not nearly as well supported as 32 bit currently. Does
hardware GL work yet? What about if you're going to do MIPI, etc?) IMO it's
worth waiting for a little more maturity.
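
As a back-of-the-envelope for the wider-pointer cost: each pointer grows from
4 to 8 bytes, so the extra memory is four bytes per live pointer. A minimal
sketch; the pointer count is an invented workload figure, and alignment,
padding, and allocator overhead are ignored:

```python
def pointer_overhead_mib(pointer_count, bytes_32=4, bytes_64=8):
    """Extra MiB used when every pointer grows from 32 to 64 bits."""
    return pointer_count * (bytes_64 - bytes_32) / (1024 * 1024)


# Hypothetical workload holding 50 million live pointers.
print(f"{pointer_overhead_mib(50_000_000):.1f} MiB extra")
```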

~~~
DagAgren
Has nobody tried to implement an aarch64 equivalent of the "x32" ABI? That is,
using the 64-bit instruction set but keeping pointers 32 bits long?

~~~
mlyle
There's Arm64ilp32, but I don't think it's in the mainline kernel and it's
really bleeding edge/strange.

When aarch64 is so badly supported on pi, it's really scary to go even further
into the fringe with Arm64ilp32.

------
technofiend
I realize I'm a little late to the party as this post is almost a day old,
however as a Pi enthusiast I feel obligated to mention RackN's edgelab [1]
project which leverages Pi4's PXE boot to rapidly build a 64-bit mini Pi lab
and then a k3s cluster on top.

Full disclosure: I don't work for RackN but am a customer; my use case is
zero-touch ESXi cluster and Linux builds but I like the tool and have way too
many Pis.

[1]
[https://github.com/digitalrebar/edgelab](https://github.com/digitalrebar/edgelab)

------
rcarmo
I’ve been running Ubuntu on most of mine for a while now, but I go back to
Raspbian now and then for the hardware support (graphics, in particular, are a
bit of a pain, but I’ve also had issues with the built-in Wi-Fi). My “lab” Pi
4 is on Raspbian also because I like noodling in Mathematica (which I don’t
think you can get in ARM64 at all for any distro).

It would be awesome to have a decent ARM64 SBC with a good GPU (able to drive
2 4K monitors and run Firefox/VS Code). Any recommendations from the non-Pi
crowd?

~~~
kevinastone
Have you explored the Nvidia Jetson Nano[0]? It runs Ubuntu Aarch64 as its
default OS.

[0]: [https://developer.nvidia.com/embedded/jetson-nano-developer-...](https://developer.nvidia.com/embedded/jetson-nano-developer-kit)

~~~
camccar
Last time I booted up my Jetson Nano, it was so slow. When I would ssh in it
would take like two seconds to get a response after typing ls. Obviously this
was sd card reads. It just doesn't have the polish/optimisations raspberry pi
has.

------
zamadatix
Has anyone gotten the aarch64 Raspberry Pi image of Alpine to work on the Pi
4? It seems like a good fit for a RAM constrained system but I could never get
it past the colorful boot screen.

~~~
nightfly
It's fairly easy to trim down what runs on Raspbian, so if your only concern
is background tasks eating RAM, you don't need something like Alpine to fix
that issue. We use Pis at my work for running slideshows, and while performing
their duty they use about 123MB of RAM for all running software (mostly our
software + monitoring tools).

~~~
zamadatix
Raspbian doesn't offer aarch64 userspace though.

------
alexellisuk
Everyone's asking for the Debian image / install instructions. I'm curious
too, from when this post was originally shared and trended in early Jan.

~~~
Legogris
sakaki- has a build guide here:
[https://github.com/sakaki-/gentoo-on-rpi-64bit/wiki/Build-an...](https://github.com/sakaki-/gentoo-on-rpi-64bit/wiki/Build-an-RPi4-64bit-Kernel-on-your-crossdev-PC)

And weekly kernel builds here:
[https://github.com/sakaki-/bcm2711-kernel](https://github.com/sakaki-/bcm2711-kernel)

She focuses more on Gentoo, but it's mostly the same - just pop it onto an
existing Debian 64 image.

------
lasermike026
Add a heatsink and fan. You'll get better performance.

~~~
joecool1029
Just run current firmware from last September or later and keep the Pi 4
standing up on its side naked; it will never throttle no matter what you throw
at it (unless maybe your room is 80F or more).

I've had my pi4 compile stuff for days on end and it does not throttle if the
above setup is followed.
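
To check a claim like this on your own board, the firmware reports throttling
state as a bit field via `vcgencmd get_throttled`. A small decoder sketch for
that output; the function name is made up, and the sample value is an
illustration rather than a reading from a real board:

```python
# Bit positions in the value reported by `vcgencmd get_throttled`.
THROTTLE_FLAGS = {
    0: "under-voltage detected",
    1: "ARM frequency capped",
    2: "currently throttled",
    3: "soft temperature limit active",
    16: "under-voltage has occurred",
    17: "ARM frequency capping has occurred",
    18: "throttling has occurred",
    19: "soft temperature limit has occurred",
}


def decode_throttled(output):
    """Decode a line such as 'throttled=0x50005' into readable flags."""
    value = int(output.strip().split("=", 1)[1], 16)
    return [text for bit, text in THROTTLE_FLAGS.items() if value & (1 << bit)]


# Illustrative sample input, not captured from real hardware.
print(decode_throttled("throttled=0x50005"))
```

A board that has never throttled since boot should report `throttled=0x0`,
which decodes to an empty list.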

~~~
guug
What did the firmware change to enable this kind of performance?

~~~
XelNika
Better power management has reduced overall power consumption by almost a
watt.

[https://www.hackster.io/news/raspberry-pi-4-firmware-updates...](https://www.hackster.io/news/raspberry-pi-4-firmware-updates-tested-a-deep-dive-into-thermal-performance-and-optimization-2f22c78e7089)

------
muth02446
What would be ideal for the Raspberry Pi 3 and up are aarch64 binaries with a
32-bit memory model (aarch64-ilp32). Does Debian support/use this?

~~~
pm215
As far as I'm aware, ilp32 support for aarch64 has been proposed and
implemented, but the patches weren't accepted upstream in the kernel because
nobody was able to make a sufficiently convincing case that there was enough
benefit to justify adding a whole new ABI to the kernel (there are a few
benchmarks where it's noticeable but mostly it just doesn't make enough
difference to be worthwhile, AIUI). Possibly the situation has changed since I
last heard about it.

------
moonbug
I think the main issue keeping the pi with a 32b userspace is the lack of
availability of 64bit GPU firmware

~~~
joecool1029
> I think the main issue keeping the pi with a 32b userspace is the lack of
> availability of 64bit GPU firmware

Uh, no. It's there because of the design goals and priorities of the
Foundation. They want to be able to distribute a distribution that runs on any
raspberry pi generation. Performance is of secondary concern to them. So
Raspbian remains at 32-bit.

~~~
guug
Can you cite a source for this claim?

~~~
joecool1029
Why yes, yes I can:
[https://www.raspberrypi.org/forums/viewtopic.php?t=252369#p1...](https://www.raspberrypi.org/forums/viewtopic.php?t=252369#p1539974)

~~~
slumos
> this will be combined with a 32bit userland, for the reason mentioned above.
> - the amount of work needed to update all the libraries that talk to the
> GPU.

~~~
joecool1029
My eyes kinda glaze over when I see that because the vc4-fkms-v3d driver works
fine enough with mesa on 2D and 3D shit using a full 64-bit system. Maybe it's
not as fast as the original proprietary driver, I don't know as I've never
used the original proprietary one nor have I seen anyone benchmark it.

