
Systemd is bad parsing - fanf2
https://blog.erratasec.com/2018/10/systemd-is-bad-parsing-and-should-feel.html
======
ambrop7
Yes, the code is bad, but it does not need to be completely rewritten. It
should interpret the data as an array of uint8_t (no casting to
structure/integer pointers!), use simple helper functions to read out
(u)int(8/16/32) values (there was a post on HN about this recently:
[https://commandcenter.blogspot.com/2012/04/byte-order-fallacy.html](https://commandcenter.blogspot.com/2012/04/byte-order-fallacy.html)),
and be careful to check things like sizes.
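
For example, a pair of such helpers could look like this (a sketch; the
names are mine):

```c
#include <stdint.h>

/* Read a big-endian (network byte order) 16-bit value out of a byte
   buffer. No casts, no byte swapping, identical behavior on any host. */
static uint16_t read_u16be(const uint8_t *buf)
{
    return (uint16_t)(((unsigned)buf[0] << 8) | buf[1]);
}

/* The same idea for 32 bits. */
static uint32_t read_u32be(const uint8_t *buf)
{
    return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
}
```

The caller checks that at least 2 (or 4) bytes remain before calling, so
the helpers themselves stay trivial.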

The code is also wrong because of strict aliasing. This is a real problem,
your program can in fact exhibit undefined behavior because of this (it
happened to me).

Some time ago I wrote some code to make these things simpler in C++. It allows
you to define "structures" (which, however, are not real structs; it's an
abstraction) and then directly access an arbitrary buffer through
setters/getters. The code is here with documentation:
[https://github.com/ambrop72/aipstack/blob/master/src/aipstack/infra/Struct.h](https://github.com/ambrop72/aipstack/blob/master/src/aipstack/infra/Struct.h).
In fact this is part of my own TCP/IP stack, and it includes DHCP code (here
are my DHCP structure definitions:
[https://github.com/ambrop72/aipstack/blob/master/src/aipstack/proto/DhcpProto.h](https://github.com/ambrop72/aipstack/blob/master/src/aipstack/proto/DhcpProto.h)).

~~~
jstimpfle
As an array of uint8_t? I'm pretty sure that's wrong, and that it should be
char instead. Usual OS network APIs are char based, and the fact that the
network is technically about "octets" shouldn't matter behind those APIs. Or
is my thinking wrong?

~~~
alxlaz
uint8_t is guaranteed to always be an octet, and stdint types (uint8_t,
int32_t, uint32_t etc.) are also the recommended types to use when you want to
make the size and signedness of the data clear. An array of uint8_t is
definitely correct, and it's also the right thing to do in 2018. An array of
char is something you should only use to store human-readable characters in
array form.

Many OS network APIs are "char based" because the C standard requires that
sizeof char be exactly one byte, so it used to be a convenient shorthand, but
that's not the case anymore.

~~~
viraptor
> the C standard requires that sizeof char be exactly one byte

That's at least 1, not exactly 1.

It's POSIX that requires char to be 8-bit.

~~~
alxlaz
Nope, the C standard requires that sizeof char = 1, regardless of the width of
a byte :-).

~~~
jstimpfle
One of the things I was not sure about is whether the size of uint8_t is
guaranteed to be 1 as well, and whether that means the two are pointer-
compatible. And if so, where is this specified?

~~~
alxlaz
As far as I know, C99 makes no requirements about the _byte_ width of int8_t,
only about its bit width. The relevant bits are in section 7.18.1.1.

There are few architectures and runtimes where this is relevant anymore, and
the answer is really best given on a case-by-case basis.

I think it's technically possible to make the two pointer-compatible on
architectures with wider bytes. Last time I used one it was Analog Devices'
SHARC, and I think they were pointer-compatible, but that was a long time ago,
so take it with a grain of salt.
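
Where code does depend on these assumptions, they can at least be pinned
down at compile time (C11 _Static_assert; on exotic targets where
CHAR_BIT != 8, uint8_t simply isn't defined at all):

```c
#include <limits.h>
#include <stdint.h>

/* Refuse to compile on platforms where a byte is not 8 bits wide, or
   where uint8_t is somehow not exactly one byte. */
_Static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");
_Static_assert(sizeof(uint8_t) == 1, "uint8_t must be one byte");
```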

------
jmartinpetersen
The article isn't really about the bug but about the (lack of) quality in the
code. Spoiler alert: He doesn't like it.

"This code has no redeeming features. It must be thrown away and rewritten yet
again. This time by an experienced programmer who know what error codes mean,
how to use asserts properly, and most of all, who has experience at network
programming."

~~~
tinus_hn
If the rest of the code looks anywhere near similar to the examples he shows
he’s right, this is terrible code and it should never have been allowed in a
serious project in this day and age.

~~~
pbhjpbhj
Perhaps it's a choice between this and nothing?

~~~
NelsonMinar
If only there were some other DHCPv6 client that we could use.

~~~
pbhjpbhj
I meant that perhaps no one else came forward to do the IPv6 DHCP module
coding for systemd-networkd (or whichever sub-module)? That's a question.

Rather than asking whether isc-dhcp-server over SysV, or whatever was better.

------
flohofwoe
I agree with most of the critique, but I don't quite get the part about the
use of assert.

The code in question quite clearly doesn't use C's standard assert macro, but
a custom macro called assert_return (which I _guess_ checks for a condition,
and returns from the function instead of aborting the program).

So basically:

    
    
        if (!condition) {
            return err;
        }
    

How else would one check for invalid input to the function?
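
For illustration, my guess at the shape of the macro (not systemd's actual
definition, which I haven't checked):

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Guessed reconstruction: if the condition fails, return the given
   error code from the enclosing function instead of aborting the whole
   process the way assert() would. */
#define assert_return(condition, err) \
    do {                              \
        if (!(condition))             \
            return (err);             \
    } while (0)

/* Example use at the top of a parser entry point. */
static int parse_message(const uint8_t *buf, size_t len)
{
    assert_return(buf != NULL, -EINVAL);
    assert_return(len >= 4, -EINVAL);
    return 0;
}
```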

~~~
XorNot
I can see the argument to be made about overloading the word "assert" in the
language though.

i.e. "assert" meaning "check something in this code is algorithmically
correct"

Whereas when you're checking user input, use a very different term - i.e.
validate or something.

So a fix would be to rename the macro to be obvious about its usage:
"validate_return" or "valid_or_return" or something.

To be fair, I can see the problem, because I do sometimes tend to write code
comments saying "assert everything makes sense" when checking user input, and
"assert" itself doesn't properly convey the outcome if it fails; i.e.
"abort_if" or "crash_if" would be a much more descriptive term.

~~~
tetha
In my opinion diluting existing, well-established terms is extremely dangerous
and harmful for a project. If you're diluting established terms - like assert
- you're kinda forcing readers to stop assuming anything about your names.

If assert isn't assert, and if return values are also abused, why can I assume
get is get, sort is sort and so on? Sort might check inputs, get might delete
the customer account, who knows at this point. And that makes anything but a
tiny code base pretty much unmanageable, or at least very very hard to learn.

------
stinos
So this is a whole rant, and it's probably pretty much spot on, but the reader
is left with a huge elephant of a question in the room: what is the proper way
to do it, then? It has samples all over the place of what's bad, saying those
should be rewritten. Why not show some C/C++ code samples of the proper
way to do it, then?

~~~
asdfasgasdgasdg
> Why not show some C/C++ code samples of the proper way to do it, then?

He provides constructive examples for all of the items that he criticizes.

\- Instead of pointer arithmetic, base_ptr, offset, len.

\- Instead of casting to internal structs, parsing into internal structs.

\- Instead of network byte order, buf[offset] + buf[offset+1] * 256 .

\- Instead of the wrong error codes, the right ones.

\- Instead of random asserts over internal data structures, validation at
parse time.
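
For the first point, a sketch of what base_ptr/offset/len parsing can look
like (my own example, not code from the article):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A cursor over an untrusted buffer: every read is bounds-checked
   against len, and the offset only advances after the check passes. */
struct cursor {
    const uint8_t *base;
    size_t offset;
    size_t len;
};

static bool cursor_read_u16be(struct cursor *c, uint16_t *out)
{
    /* Written to avoid overflow: never compute offset + 2 directly. */
    if (c->offset > c->len || c->len - c->offset < 2)
        return false;
    *out = (uint16_t)(((unsigned)c->base[c->offset] << 8) |
                      c->base[c->offset + 1]);
    c->offset += 2;
    return true;
}
```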

~~~
clarry
> \- Instead of network byte order, buf[offset] + buf[offset+1] * 256 .

And you got it wrong. Shows that there is no silver bullet.

~~~
int_19h
The silver bullet is to use the standard library that provides appropriate
abstractions, such as byte streams with functions like "read a 32-bit little-
endian integer". Nobody should actually be writing code like the above in most
cases; and in the very very few cases where they have to (e.g. when forced to
program in an environment where these haven't been provided), they should
write it exactly once as a reusable function, complete with appropriate tests.

~~~
tetha
And from experience, I find all of those tests for functions like this stupid,
because the function seems too trivial to write tests for.

And then I always get the first one to three implementations wrong in some
weird way, feel extremely stupid in the process, and end up liking my tests.

------
mackal
The blog makes an error in its assumption of what assert_return() is ... it's
not a macro around assert() or anything related.

Possibly a bad idea on systemd's part to reuse the name assert, but the code
isn't doing what the blog assumes it does.

------
chris_wot
As an aside, I find it a little ironic that I found it hard to parse the
headline "Systemd is bad parsing". No, Systemd is not bad parsing, Systemd is
parsing badly.

~~~
hyperpape
The original is “Systemd is bad parsing and should feel bad”, which sounds a
little weird, but it’s a meme, so there are lower standards.

I guess someone felt like it shouldn’t be a meme headline on HN, but the
result is nonsense.

------
polotics
So it's true what some-of-you'all been saying all along about systemd!

;-D

One question: any smell of intentional obfuscation for this vulnerability, or
is it impossible to tell?

~~~
dsr_
Here's a thing to ponder:

If this were not buried in systemd, but was part of a standalone dhcpd
package, would it have been done better or caught earlier?

One of the major criticisms of systemd is that it wants to take over all
system management functions. In order to do that it needs to offer at least
minimally functional versions of all the services it pushes out. It should not
be news to anyone that trying to do everything instead of integrating with
existing projects leads to sloppiness.

This incident is evidence suggesting that other serious issues are lurking in
systemd as a result of the actions naturally arising from their policy.

~~~
toast0
I'm sure (but don't recall) that Redhat's pump and ISC's DHCPCD have had CVEs;
we all know BIND's resolver and ntpd had CVEs.

If one is going to rewrite all of these things, I would expect one to learn
from the history and not make the same mistakes over again.

~~~
deialtrous
I would only expect that if the people rewriting everything were good
programmers and said they were rewriting it to be secure. Most rewrites are
done for non-security reasons and don't improve security.

~~~
mannykannot
Programmers should give appropriate attention to security regardless of
whether they are doing original work or a rewrite, and in the latter
case, regardless of the reasons for the rewrite. I would not regard those who
do otherwise as good programmers. Clearly, a DHCPv6 client within Linux'
system administration suite is a case where security is an issue of the first
magnitude.

~~~
deialtrous
Programmers should do a lot of things that most of them do not do. The
statement was "I would expect..." and I replied that it doesn't make sense to
expect those things because the vast majority of programmers don't know or
care about security at all. We still see the same trivial SQL injections being
written again and again on today's web, often by 20+ year programming
"veterans".

------
eeZah7Ux
> In the late 1990s and early 2000s, we learned that parsing input is a
> problem.

What? It was known well in advance.

~~~
ben0x539
I suspect in advance of the 1990s it was a problem to people who specifically
cared about such things, and would read or write papers on the topic. In the
1990s it started to become a problem for a lot of people who really only cared
about putting their programs on a widely-accessible network. ;)

------
rphlx
For some further, semi-amusing support for the title's assertion, save all of
your work, close important programs, and then run 'sudo systemctl --dry-run
reboot'

------
zwkrt
Other people's gripes aside, I enjoy that the author's approach to avoiding
out-of-bounds errors almost exactly mirrors Java's nio Buffer interface.

------
stmw
Article declares that "What you should do instead is parse data the same way
as if you were writing code in JavaScript," followed by "pointer arithmetic is
wrong" (paraphrasing). That seems a bit extreme...

------
MichaelMoser123
gcc runs with -fno-permissive by default; it will give you a warning if
you implicitly try to convert between pointers of different types. (Turn on
warnings-as-errors with -Werror so that you have to deal with these messages.)

Now what I don't understand is that for C you are not allowed to disable this:
-fpermissive is not allowed for C, whereas for C++ you can. Does anyone know
the reason why? The compiler error says "command line option "-fpermissive" is
valid for C++/ObjC++ but not for C" \- why does the compiler make this
distinction?

------
exabrial
I typically disable ipv6 unless I specifically plan to offer/use it as a
public service. Just more things to keep an eye on for no benefit to the
actual business.

~~~
dvfjsdhgfv
Just like me and many other sysadmins. What amazes me is that the people who
designed IPv6 still don't get it and seem surprised that it hasn't taken over
the Internet yet.

~~~
ben0x539
A lot of mobile phones access the internet via IPv6 by default, and IPv4-only
sites that want to reach them require non-trivial compatibility infrastructure
like carrier-grade NAT or more complicated shenanigans. It seems like some
sites are starting to identify these shims as a huge stumbling block in their
continuous work to improve latency/quality-of-service.

I think Facebook in particular made a bunch of noise in that direction and
celebrates how prioritizing IPv6 has let them make huge improvements in load
times etc, but at this point you're as likely to find good information about
that on Google as I am so I'll spare you any links.

So while IPv6 might not have taken over the Internet in any particularly
observable manner so far, I think it's fair to say it has huge momentum at
this point and has buy-in way beyond some IETF nerds and protocol fetishists.

~~~
dvfjsdhgfv
> It seems like some sites are starting to identify these shims as a huge
> stumbling block in their continuous work to improve latency/quality-of-
> service.

I'd be curious to know how much time you can actually save because really,
honestly, in the case of Facebook this particular aspect should be the last of
their concerns as far as page load times are concerned.

~~~
zlynx
Hundreds of milliseconds per TCP (or UDP) session. The CGNAT boxes are
handling hundreds of thousands of flows. And they're part of failover clusters
so each flow has to be shipped to the other cluster members and acknowledged
before starting the flow. And these things are often overloaded and adding
tens of milliseconds delay to each packet passing through. Because whoever
heard of an ISP spending more than "just enough" on hardware?

~~~
dvfjsdhgfv
You see, my point is that these milliseconds are negligible in the case of a
website like Facebook that takes 20 seconds to fully load on a reasonably
modern machine.

~~~
zlynx
You may already be behind an overloaded CGNAT then. Over IPv6 on Comcast
cable, with the Edge browser, Facebook is fully loaded for me in 3 seconds.
Surface tablet in power-save mode so this isn't some overpowered desktop
monster.

~~~
dvfjsdhgfv
That's very interesting. I've just measured it in private mode, all plugins
off, from the moment of clicking login till the time the page load indicator
stopped running. 13 full seconds.

~~~
zlynx
For fairness I just did that myself. 4 seconds.

------
verytrivial
This article started out really well, saying you shouldn't e.g. byte-swap as
part of the overall task at hand, then wanders into comparisons with other
languages. I had hoped it was instead going to say that you should use some
verified parsing tool to unpack data. All the other languages, at some level,
will byte swap! It is the choice of the level of abstraction (or lack thereof)
which is the systemd bug here, not the choice of a C family language IMNSHO.

------
IshKebab
This is the first reasonable criticism of systemd I've heard. That is very bad
code.

------
bvinc
He talks like the way he writes code has been established as best practice for
decades, and if he can just write enough blog posts and convince enough
people, maybe we can solve this problem once and for all.

I think the only way to have high assurances is to program under a system that
doesn't have these classes of errors. I don't care if it's a C static checker
or a new language or what.

Unfortunately the long term solution to these problems in system programming
seems to be rust. But it's going to take decades for it to be established and
I'm going to look like a language hipster member of the rust evangelism strike
force every time I bring it up.

But seriously, it's been decades and decades and we're still writing buffer
overflows. What is the long term final nail in the coffin for buffer
overflows?

~~~
someguydave
The long term solution is to burn down x86 and C and start over with sane
hardware, which has bounds checking _in hardware_.

~~~
chmod775
> bounds checking in hardware.

What. That's a horrible idea. Not only would that require your hardware to
understand abstractions several layers above what it actually operates on, it
would also mean it'd have to check the bounds of, for example, an array on
every instruction accessing it. That is something compilers can do at more
strategic places.

Please keep that complexity out of my CPUs, they're buggy enough already. At
least with software we can fix issues later, which isn't a given with
hardware.

~~~
pjc50
Having the CPU do the checking within an instruction saves you instruction
issue events. ARM have an interesting solution to this, pointer
authentication:
[https://news.ycombinator.com/item?id=14479438](https://news.ycombinator.com/item?id=14479438)

> several layers above what it actually operates on

The instruction set is something of an abstraction bottleneck - C is tied to
machines that look something like a PDP-11, while a lot of the stuff that the
machine _is_ doing is invisible at the instruction set layer.

~~~
makomk
Pointer authentication is not a form of bounds checking. It solely exists to
prevent a malicious attacker using a bug to directly overwrite pointers (and
in practice, it's mostly limited to return addresses right now). Code which
does buggy pointer arithmetic will still create valid pointers to data outside
the intended buffer.

------
Go0the0gophers
Always fuzz your parser! (e.g. AFL, go fuzz)
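
For a C parser the harness is tiny; a libFuzzer-style entry point looks like
this (the toy parser is mine, just to make the sketch self-contained; a real
harness would call the project's actual entry point instead):

```c
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the parser under test: a 2-byte code, a 2-byte
   length, then that many bytes of payload. */
static int toy_parse_option(const uint8_t *buf, size_t len)
{
    if (len < 4)
        return -1;                          /* too short for a header */
    uint16_t optlen = (uint16_t)(((unsigned)buf[2] << 8) | buf[3]);
    if (optlen > len - 4)
        return -1;                          /* option overruns buffer */
    return 0;
}

/* libFuzzer calls this with mutated inputs; AFL can drive it too.
   Build with: clang -fsanitize=fuzzer,address harness.c */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    toy_parse_option(data, size);
    return 0;
}
```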

------
xtrapolate
> "The other thing that you shouldn't do, even though C/C++ allows it, is
> pointer arithmetic. Again, it's one of those epiphany things C programmers
> remember from their early days. It's something they just couldn't grasp
> until one day they did, and then they fell in love with it. Except it's bad.
> The reason you struggled to grok it is because it's stupid and you shouldn't
> be using it. No other language has it, because it's bad."

Honestly, this is a pointless rant. "Do this, but don't do that (because
that's stupid)".

> "No other language has it, because it's bad"

The reason you "don't have pointer arithmetic in other languages" is - it's
been abstracted away from you, for your convenience, so that you can just
concentrate on your business logic and not have to worry about the rest. You
may not deal with pointer arithmetic, but the people developing your
JVM/CPython/V8/Chakra sure do. Author is simply ranting about the possibility
of abusing/misusing the mechanisms offered by C/C++.

~~~
zwkrt
Programming as a profession is strange because we each do such different
things but all tools are available to us all the time. Imagine if this were
true in the physical world; you would read articles written by accountants
like "Dynamite Considered Harmful" with rebuttals titled "Why Dynamite
Matters" by construction engineers...

~~~
xtrapolate
It's almost as if things aren't really black or white, there's usually plenty
of grey in-between. Things like use-cases, scenarios, trade-offs, intricacies.
I find it awkward that a security shop would publish an article using that
language - I would personally be embarrassed.

~~~
jolmg
> I would personally be embarrassed

The rest of the comment is good, it adds to the argument, and I even agree
with it, but there is no reason for this to be here. It doesn't add anything;
it's just like calling names. From the HN guidelines:

> Don't be snarky.

> When disagreeing, please reply to the argument instead of calling names.
> "That is idiotic; 1 + 1 is 2, not 3" can be shortened to "1 + 1 is 2, not
> 3."

EDIT: To be downvoted like so for reminding people of the guidelines, HN sure
is changing. I guess no good thing lasts forever.

------
a3_nm
If like me you are looking for the CVE and not for commentary about the
affected code, here it is:
[https://nvd.nist.gov/vuln/detail/CVE-2018-15688](https://nvd.nist.gov/vuln/detail/CVE-2018-15688)

FWIW, from
[https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=912008](https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=912008)
it looks like the vulnerable code is not enabled by default on Debian.

~~~
JdeBP
The rather less strongly titled "Systemd-networkd memory corruption" pointing
to the much more detailed Ubuntu bug has also been on Hacker News for a day.

* [https://news.ycombinator.com/item?id=18303613](https://news.ycombinator.com/item?id=18303613)

------
microcolonel
Some of these complaints are more "waah, I don't like C and what people use
pointers for!", which is a shame, because some are pretty reasonable concerns.

systemd _works_, that's why people use it. If it needs improving, then
_improve it_, or file these as issues. Somebody will probably agree and at
least look into them, rather than leaving it to a gaggle of strangers pointing
and laughing.

------
filam
I have to admit I stopped reading after the combined section on byte swapping
and casting external data to an internal structure. Perhaps the author of the
code should have used ntohs() instead of be16toh(), but the effect is the
same? There aren't any conditionals in ntoh...

[https://commandcenter.blogspot.com/2012/04/byte-order-fallacy.html](https://commandcenter.blogspot.com/2012/04/byte-order-fallacy.html)

And will a compiler make use of native byte swapping instructions with the
code proposed by the author?

~~~
ben0x539
The argument against byte swapping suggests that you shouldn't take a value of
a numeric type and apply some (maybe no-op) permutation to it to produce
another value of the numeric type. Instead, you should only ever think about
byte order at the step where you have an array (or stream, or...) of
individual bytes and are converting it to a value of a numeric type (larger
than uint8).

I think this is also the main point of the post you linked: the "bad" example
has a numeric value and then performs byte swapping on it, the "good" examples
only ever construct "correctly ordered" numeric values.

If you can possibly get to a place where you have a uint16 but the value is
"in the wrong byte order", you've done something wrong already.

Admitting the possibility of values with the wrong byte order into your system
casts every single value into doubt. Of course it's possible to resolve that
doubt on a case-by-case basis by carefully identifying the right places to
insert ntohs()/htons() calls, but to me that sounds like creating extra chores
for yourself while working against the language.
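
Concretely, the two styles side by side (my example, names mine):

```c
#include <arpa/inet.h>  /* ntohs(); POSIX */
#include <stdint.h>
#include <string.h>

/* The style being argued against: for a moment, `raw` holds a uint16_t
   whose value is in the "wrong" byte order on little-endian hosts, and
   correctness depends on remembering ntohs() at every such site. */
static uint16_t swap_style(const uint8_t *buf)
{
    uint16_t raw;
    memcpy(&raw, buf, sizeof raw);
    return ntohs(raw);
}

/* The style being argued for: byte order is handled exactly where bytes
   become a number, so a wrongly-ordered uint16_t never exists at all. */
static uint16_t construct_style(const uint8_t *buf)
{
    return (uint16_t)(((unsigned)buf[0] << 8) | buf[1]);
}
```

Both return the same value on any host; the difference is that the second
leaves no window in which a misordered value can leak into the rest of the
program.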

~~~
meowface
This philosophy is similar to Python 3's hard separation of bytes and strings.
Bytes are raw data you get off the wire and which are encoded in a certain
way. Once decoded into a string, you have a clean, consistent internal
representation that behaves as you expect a string would. If you're juggling
(or even thinking about) byte encodings past the parsing stage and before the
output stage, you've done something wrong.

~~~
ben0x539
Yeah, it generalizes to an argument for using types to encode assumptions
about data, so you have static certainty/remove dynamic uncertainty. I'm
strongly in favor of all arguments that let me remove sources of uncertainty
from my mental model, since enough stuff I'm uncertain about always remains.

