
Redis on the Raspberry Pi: Adventures in unaligned lands - bjerun
http://antirez.com/news/111
======
drej
I never deal with such low level issues, so I don't have to read this, but...
reading these posts by antirez is such a joy. He makes this topic so clear and
understandable, he doesn't assume much, he doesn't use overly complex
explanations, he just "says it like it is" :-)

Thanks!

~~~
hellwd
++ :)

------
drewg123
I fondly remember unaligned access faults "back in the day" with
FreeBSD/alpha. We implemented a fixup for applications, but not for the
kernel. I seem to recall that even though x86 could deal with unaligned
accesses, it caused minor performance problems, so fixing alignment issues on
alpha would benefit x86 as well.

Most (definitely not all) of the mis-alignment problems were in the network
stack, and were centered around the fact that ethernet headers are 14 bytes,
while nearly all other protocols had headers that were a multiple of at least
4 bytes.
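
To make the arithmetic concrete, here is a minimal sketch in C. The names `ETH_HEADER_LEN` and `payload_misalignment` are made up for illustration, and it assumes the receive buffer itself starts on a 4-byte boundary:

```c
#include <stddef.h>
#include <stdint.h>

/* 6 bytes destination MAC + 6 bytes source MAC + 2 bytes EtherType. */
enum { ETH_HEADER_LEN = 14 };

/* Offset, modulo 4, at which the payload (e.g. an IPv4 header) begins
 * when the frame starts `pad` bytes into a 4-byte-aligned buffer.
 * A result of 0 means the payload's 32-bit fields are naturally aligned. */
static unsigned payload_misalignment(size_t pad)
{
    return (unsigned)((pad + ETH_HEADER_LEN) % 4);
}
```

With `pad` of 0 the payload lands 2 bytes off a word boundary; with `pad` of 2 it lines up, which is the usual trick of starting the frame 2 bytes into the buffer, and also what a 16-byte header would have given for free.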

I've said it before, and I'll say it again: If I had a time machine, I would
not kill Hitler. I'd go back to the 70s and make the ethernet header be 16
bytes long, rather than 14.

~~~
IgorPartola
Why in god's name did they make it 14?!

~~~
pjc50
Ethernet was invented in 1973 and the first 32-bit processors were available
in 1979.

While you've got the time machine, can you fix it so that "network byte order"
and Intel endianness are the same too?
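
Until the time machine arrives, a minimal sketch of how portable code usually sidesteps the mismatch: build the big-endian ("network order") representation one byte at a time, which works the same on any host and never needs an aligned pointer. `store_be32` and `load_be32` are hypothetical helper names, not from any particular library:

```c
#include <stdint.h>

/* Write v into out[0..3] in network (big-endian) byte order. */
static void store_be32(unsigned char out[4], uint32_t v)
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/* Read a big-endian 32-bit value from in[0..3]. */
static uint32_t load_be32(const unsigned char in[4])
{
    return ((uint32_t)in[0] << 24) |
           ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |
            (uint32_t)in[3];
}
```

Because only single bytes are touched, the same code is correct on little-endian and big-endian hosts and at any buffer offset.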

~~~
swiley
Or rather keep Intel from munging the order their processors write bytes in.

~~~
adrianratnapala
What?

------
blattimwind
There is a funny mode on ARM processors (turned on in some images, by default)
which causes unaligned reads to silently return bogus data (just increasing a
kernel counter).

PowerPC, and really, most non-x86 architectures, do this one way or another.

~~~
faragon
PowerPC (and POWER) has reasonable hardware support for unaligned memory
access, at least for 32-bit data, and if the data is in the data cache.
Depending on the processor, the exceptions that reach the OS can be more or
less frequent.

ARM v6-A and later (except for some microcontrollers, like Cortex M0/R0, that
don't support hardware unaligned access at all and trigger an exception) is
similar to the Intel x86 case: the hardware handles unaligned memory access
transparently, except for SIMD, where x86 can raise exceptions, too, on
unaligned load/store accesses.

For software that makes intensive use of non-aligned data access, e.g. data
compression algorithms doing string search, PowerPC, ARM v6-A (and later ARM
Application processors), new MIPS with HW support for unaligned memory access,
and Intel are pretty much the same (i.e. b = *(uint32_t *)(a + 23) will take
1-2 cycles, without needing a memcpy(&b, a + 23, sizeof(uint32_t))).
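
For reference, a minimal sketch of the memcpy form mentioned above (`load_u32` is a hypothetical helper name). Strictly speaking the pointer cast is undefined behavior in C when the address is not suitably aligned, even on CPUs that tolerate it, while the memcpy form is always valid. On targets with hardware unaligned access, optimizing compilers commonly turn the fixed-size memcpy into a single load, though that depends on the compiler and flags:

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value in host byte order from any address,
 * aligned or not, without dereferencing a misaligned pointer. */
static uint32_t load_u32(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

Usage would be `b = load_u32(a + 23);` in place of the cast.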

For SIMD, though, there is no transparent fix, although there are specific
opcodes for aligned and unaligned memory access (e.g. load/store, unaligned
load/store).

~~~
antirez
I would say that ARM v6 and later is a major step forward, but it is v8 that
really seems to be similar to Intel finally. The v6 was able to deal only with
single fetch/store unaligned instructions, but things like accessing a double
or multiple words with the same instruction would raise an exception.

~~~
faragon
Accessing unaligned 64 bit data in 32 bit ARM mode can generate exceptions,
even in ARM v8 CPUs when running code in 32 bit mode. Full unaligned memory
access for 16/32/64/128 bits is only guaranteed in AArch64 mode, if I recall
correctly.

~~~
pm215
For 64-bit ARM (AArch64), load-exclusive/store-exclusive and load-
acquire/store-release require aligned addresses. (Seems reasonable to me,
trying to handle atomic accesses to unaligned data is no fun). You also get a
fault for any kind of unaligned access to Device memory, but you're not going
to have that unless you're the kernel, and unaligned accesses to hardware
registers are definitely not going to work out very well...

(The rules are all fairly clearly documented in the Architecture Reference
Manual.)

------
throwaway000002
I'm probably the only weirdo that thinks this, but if you support byte-
addressing you'd better as well be happy with byte-alignment. Atomics being
the only place where it's reasonable to be different.

Which brings me to padding. I wonder what percentage of memory of the average
64-bit user's system is padding? I'm afraid of the answer. The heroes of
yesteryear could've coded miracles in the ignored spaces in our data.
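
A minimal sketch of where that padding comes from, using two made-up structs (`loose` and `tight`) that hold the same three fields in different orders. The exact sizes are ABI-dependent, so the helpers below compute them instead of hard-coding them:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record with the fields in a wasteful order. */
struct loose {
    uint8_t  tag;    /* usually followed by padding up to value's alignment */
    uint64_t value;
    uint8_t  flag;   /* usually followed by tail padding */
};

/* Same fields, widest member first. */
struct tight {
    uint64_t value;
    uint8_t  tag;
    uint8_t  flag;
};

/* Bytes of actual data in either struct: 1 + 8 + 1. */
enum { PAYLOAD_BYTES = 1 + 8 + 1 };

/* Bytes of each struct that carry no data at all. */
static size_t padding_loose(void) { return sizeof(struct loose) - PAYLOAD_BYTES; }
static size_t padding_tight(void) { return sizeof(struct tight) - PAYLOAD_BYTES; }
```

On a typical 64-bit Linux ABI (x86-64 or AArch64) I would expect `sizeof(struct loose)` to be 24 and `sizeof(struct tight)` to be 16, i.e. 14 versus 6 bytes of padding for 10 bytes of data.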

~~~
wzdd
> if you support byte-addressing you'd better as well be happy with byte-
> alignment

All ARM processors do this. The concept is called "natural alignment" and it's
pretty common on non-x86. See e.g.
[http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc....](http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0472c/BABEEDGH.html)
. The problem here is that a lot of code written for x86 wants more than that,
e.g. byte addressing for non-byte-wide values.

~~~
throwaway000002
I understand. What I mean is that if your word-size is not your addressing-
size, you'd better not have a concept of mis-aligned accesses. It's trouble
you brought on all by yourself.

~~~
tedunangst
The Cray did this, iirc, and the result was that char pointers were extra fat
because they needed to include the word address and the byte address within
the word. That's not an efficiency improvement.

------
MrBuddyCasino
Accessing memory locations ending in 0x7? Gather round the campfire folks,
James Mickens has a story to tell:
[https://www.usenix.org/system/files/1311_05-08_mickens.pdf](https://www.usenix.org/system/files/1311_05-08_mickens.pdf)

------
luhn
> Redis is adding a “Stream” data type that is specifically suited for streams
> of data and time series storage, at this point the specification is near
> complete and work to implement it will start in the next weeks.

This sounds like it could be really exciting. Is there anywhere I can find out
more?

Specifically, I've been struggling to find an appropriate backend for HTTP
Server-Sent Events, could this feature help with that?

~~~
antirez
Hello, please check my two Redis Conf 2017 talks on youtube. There is info
about Streams.

~~~
fancy_pantser
Did my enhancement make it into the skip list implementation being used for
the STREAM type? I am hoping it would be in place before you publish
benchmarks for it.

[https://github.com/antirez/redis/pull/3889](https://github.com/antirez/redis/pull/3889)

~~~
antirez
Hello, very interesting! I missed this, just commented on the issue. The
Streams are not based on skiplists, but instead will be implemented using
[http://github.com/antirez/rax](http://github.com/antirez/rax)

~~~
fancy_pantser
Thanks for having a look! I only read the early proposals for the data
structure behind Streams and haven't had a chance to go over the final
implementation. I hope to dive into the source this week and make more
contributions down the road!

------
msarnoff
Recently I've been doing a lot of low-level work with ARMv7-M microcontrollers
(specifically, NXP's Kinetis Cortex-M4 chips) and was quite pleased to find
out that they are pretty lenient about unaligned accesses. To quote from the
ARM Cortex-M4 Processor Technical Reference Manual:

"Unaligned word or halfword loads or stores add penalty cycles. A byte aligned
halfword load or store adds one extra cycle to perform the operation as two
bytes. A halfword aligned word load or store adds one extra cycle to perform
the operation as two halfwords. A byte-aligned word load or store adds two
extra cycles to perform the operation as a byte, a halfword, and a byte. These
numbers increase if the memory stalls."

However, multi-word memory instructions (LDRD, STRD, LDM, STM, etc.) always
require their arguments to be word-aligned.
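
As a rough sketch, the quoted rules can be written down as a small C function. `extra_cycles` is a hypothetical name, it only models the single-access cases from the quote (halfword and word), and it ignores memory stalls:

```c
#include <stdint.h>

/* Extra cycles for one load/store of `size` bytes at `addr`, following
 * the penalties quoted above. Covers only 2-byte (halfword) and 4-byte
 * (word) accesses; anything else is reported as having no penalty. */
static unsigned extra_cycles(uintptr_t addr, unsigned size)
{
    if (size == 2) {
        /* Odd address: performed as two bytes. */
        return (addr & 1u) ? 1u : 0u;
    }
    if (size == 4) {
        if ((addr & 3u) == 0u)
            return 0u;              /* word-aligned */
        /* Odd address: byte + halfword + byte. Otherwise two halfwords. */
        return (addr & 1u) ? 2u : 1u;
    }
    return 0u;
}
```

For example, a word access at an address ending in 0x2 costs one extra cycle, and one at an address ending in 0x1 or 0x3 costs two.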

------
type0
Great article, this project just begs the name of _Redisberry Pi_

------
JefeChulo
In a future project I might be interested in using Redis for queuing jobs, so
it comes in very handy to know early on the main issues I could run into when
developing.

------
amelius
Could Rust's typesystem catch unaligned pointer dereferences?

~~~
bbatha
Sort of, Rust is supposed to make references to packed structure members
unsafe, but currently doesn't. An RFC was accepted to change the behavior but
it has not been fully implemented. Here's the tracking issue:
[https://github.com/rust-lang/rust/issues/27060](https://github.com/rust-lang/rust/issues/27060)

------
dis-sys
I wonder what kind of performance overhead comes from letting the kernel
handle unaligned accesses, compared with fixing the software to always use
aligned accesses?

------
crncosta
Nice article!

------
k__
OT: Is blattimwind shadow banned?

~~~
retox
No, but posting while green will usually get your comment downvoted to
oblivion, even if you are erudite and contribute to the conversation.

Turn on "show dead comments" and see how many greens are deleted. I screenshot
many examples.

~~~
taneq
Is this cause (i.e. people downvote greens out of prejudice) or effect (greens
are often created to shitpost)?

And to concentrate all my meta in one place... Is shadow banning a thing at
HN? I thought they just, well, banned you.

~~~
Rjevski
OT - new user here, what does green mean? I assumed it was staff/moderators.

~~~
mbel
It means new user (hence green color).

~~~
jancsika
I just assumed green meant admin or some kind of paying customer.

I'd venture to guess green users could use this ambiguity to play subtle
rhetorical tricks on users with a moderate number of points here.

