
Fuzz rising - masklinn
https://www.cloudatomiclab.com/fuzz/
======
pron
It's worth mentioning that there are _sound_ static analysis tools that
_guarantee_ no undefined behavior in C programs (e.g.
[https://trust-in-soft.com/](https://trust-in-soft.com/)). Much cheaper and more realistic than
a rewrite. Tenable or not, much of the software we rely on will remain written
in C in the next few decades.

~~~
0815test
> Much cheaper and more realistic than a rewrite.

I really have to dispute this, these tools are _not_ simple to use. You'll
probably have a far easier time rewriting the code piecemeal, i.e. function-
by-function (which will involve a _lot_ of unsafety initially, since you're
not using the idiomatic global patterns of a memory-safe language!) and then
refactoring into something idiomatic for the language you're using. Projects
like Firefox are basically taking a similar approach, only "rewriting" small
portions of the codebase that can be deployed immediately once rewritten.

Where similar tools might be useful is for preventing _logic_-related issues
that are not encompassed under simple memory safety. Unfortunately it's still
not possible to endow, e.g. Rust code with automatically-checkable proofs,
showing that the logic in the code preserves some appropriate
conditions/invariants; while this is feasible, e.g. in Agda or Idris.
Hopefully by the time these concerns become pressing, practical memory-safe
languages will also offer this.

~~~
pron
When you want to verify functional properties these tools are not easy to use.
For safety, they are easier than a rewrite and largely automatic. Large,
sensitive codebases have been verified (for undefined behavior, not functional
correctness) relatively quickly. In general, they cover more ground
significantly more quickly and cheaply than a rewrite.

------
rlpb
> Cut down your dependencies on Linux distributions.

I think the author has this backwards. Distributions don't play language
favourites[1], and distributions like Debian celebrate the packaging of
alternatives to give users choice. The author is right in observing that most
C code is shipped by distributions and not elsewhere for packaging reasons,
but then later inverts the causal relation.

If good replacements for components [that are currently] written in C appear,
then distributions will package them, and the author's problem will go away.

[1] Except that distributions dislike toolchains that embed libraries into
builds instead of dynamically linking them, since then they can't issue a
security update for a library just by updating that library; a world rebuild
is required. This isn't a language or safety issue though, and could easily be
changed without changing the language. For example, gcc's golang
implementation supports shared libraries now.

~~~
pjmlp
Probably someone can post a link to this, as it doesn't seem to exist anymore.

The GNU project used to have a manifesto page from the early days stating that
C should be the preferred language to deliver GNU software, except in cases
where it did not make sense, like libraries for other languages, followed by
a list of endorsed alternative languages.

~~~
rlpb
Perhaps that's because it made sense back then, but makes less sense now? :)

~~~
pjmlp
It only made sense in the context of cloning UNIX software and its relation to
C.

Regarding safety it never made sense. The Morris worm is now around 30 years old
and lint was designed in 1979, with the expectation that every C developer
would actually use it.

~~~
rlpb
> Regarding safety it never made sense.

What suitable and sufficiently performant languages existed back then?

~~~
pjmlp
First of all, contrary to urban myths, C arrived 10 years later to the scene
of systems programming languages. It wasn't the genesis of them.

There were already computers being developed in high level languages since
1961; the most well-known languages are ESPOL, NEWP, Algol subsets, PL/I and its
variants (PL/S, PL/M, PL.8,...), Mesa, Modula-2. And for those with deep
pockets in the early '80s, Ada.

As for performance, C was a lousy language for optimising compilers; it
was only due to UB and lots of hard work that it actually arrived where it is
today.

"Oh, it was quite a while ago. I kind of stopped when C came out. That was a
big blow. We were making so much good progress on optimizations and
transformations. We were getting rid of just one nice problem after another.
When C came out, at one of the SIGPLAN compiler conferences, there was a
debate between Steve Johnson from Bell Labs, who was supporting C, and one of
our people, Bill Harrison, who was working on a project that I had at that
time supporting automatic optimization...The nubbin of the debate was Steve's
defense of not having to build optimizers anymore because the programmer would
take care of it. That it was really a programmer's issue.... Seibel: Do you
think C is a reasonable language if they had restricted its use to operating-
system kernels? Allen: Oh, yeah. That would have been fine. And, in fact, you
need to have something like that, something where experts can really fine-tune
without big bottlenecks because those are key problems to solve. By 1960, we
had a long list of amazing languages: Lisp, APL, Fortran, COBOL, Algol 60.
These are higher-level than C. We have seriously regressed, since C developed.
C has destroyed our ability to advance the state of the art in automatic
optimization, automatic parallelization, automatic mapping of a high-level
language to the machine. This is one of the reasons compilers are ...
basically not taught much anymore in the colleges and universities."

-- Fran Allen interview, excerpted from: Peter Seibel, Coders at Work:
Reflections on the Craft of Programming

An interesting fact regarding security: MIT continued with Multics even after
Bell Labs dropped out.

Guess what, the DoD later assigned it a higher security level than UNIX, thanks to
PL/I being the systems language.

[https://multicians.org/b2.html](https://multicians.org/b2.html)

------
dinglejungle
The title of this submission ("Fuzzing makes memory-unsafe languages
untenable") makes no sense[1] and is not found anywhere in the linked article.

[1] perhaps the intended meaning of the title was "fuzzing shows that memory-
unsafe languages are untenable", but that's certainly not the meaning of the
current title

~~~
masklinn
It's the tagline the author used on Twitter:
[https://twitter.com/justincormack/status/1153060402495991808](https://twitter.com/justincormack/status/1153060402495991808)
and seemed clearer than the actual title.

> perhaps the intended meaning of the title was "fuzzing shows that memory-
> unsafe languages are untenable", but that's certainly not the meaning of the
> current title

No, fuzzing _makes_ these languages untenable because it provides a tool for
automating the discovery of memory unsafety issues. Without mature fuzzing tools, most of these
issues can remain unfound, but fuzzing surfaces them — and their potential for
exploitation — rather easily.

It's a bit of a "security by obscurity" thing, but I think there's a point to
this view: fuzzing takes the existing crack / fault of memory unsafety[0] and
blasts it open so wide you can get a truck through.
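
To make that concrete, here is a minimal sketch of a mutation fuzzer. Everything in it is hypothetical and for illustration only: `parse_record` is a made-up toy parser with an unchecked length field, and `fuzz` is a naive byte-flipping loop, not how a production fuzzer such as AFL or libFuzzer works. In Python the out-of-bounds read shows up as a clean `IndexError`; the equivalent C code would simply read past the end of the buffer.

```python
import random


def parse_record(data: bytes) -> bytes:
    """Toy parser (hypothetical): first byte is a length, then the payload."""
    if not data:
        raise ValueError("empty input")
    length = data[0]
    # Bug: the length field is trusted without checking it against len(data).
    return bytes(data[1 + i] for i in range(length))


def fuzz(target, seed: bytes, iterations: int = 1000, rng_seed: int = 0):
    """Randomly overwrite one byte of `seed` per round and collect crashing inputs."""
    rng = random.Random(rng_seed)  # fixed seed so runs are reproducible
    crashes = []
    for _ in range(iterations):
        mutated = bytearray(seed)
        pos = rng.randrange(len(mutated))
        mutated[pos] = rng.randrange(256)
        try:
            target(bytes(mutated))
        except ValueError:
            pass  # expected rejection of malformed input
        except IndexError:
            # Out-of-bounds access: the memory-safe analogue of a buffer over-read.
            crashes.append(bytes(mutated))
    return crashes


crashes = fuzz(parse_record, seed=b"\x03abc")
print(f"{len(crashes)} crashing inputs found")
```

Even this blind loop trips the bug within a few iterations, which is the point: once the search is automated, a latent bounds error does not stay hidden for long.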

~~~
nullc
> fuzzing makes these languages untenable because it provides a tool for
> automating memory unsafety issues. Sans mature fuzzing tools, most of these
> issues can remain unfound,

<Pointy-haired Boss>Fuzzing is now forbidden in our offices. Next
problem?</phb>

~~~
snaky
Jokes aside, the Pointy-haired Boss asks why we should care about security issues
from a business standpoint. Do we know of any company that went out of business due
to a security breach?

~~~
jusob
Yes, last month for example:
[https://healthitsecurity.com/news/amca-files-chapter-11-afte...](https://healthitsecurity.com/news/amca-files-chapter-11-after-data-breach-impacting-quest-labcorp)

------
phoe-krk
> Linux distros are the de facto package manager for C code, and C++ to a
> lesser extent

It's not a conclusion I'd end up with, but after a moment of thought, I agree
with it.

------
d33
Is it just me, or is this pie chart extraordinarily terrible because the legend
is not in order and there are just too many similar colors? And I'm not even
colorblind.

~~~
tinus_hn
It’s terrible because there are just too many colors. There is no focus so the
graph explains nothing.

~~~
titzer
This would be easily fixable.

1. Sort quantities in both the list and the chart by descending proportion.
2. Assign colors systematically, e.g. a spectrum from red to purple, or shades
of primary colors (bonus if you pick better colors for certain types of color
blindness).
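
A rough sketch of both steps, using only the standard library. The category names and counts are placeholders (the article's real data isn't reproduced here), `style_slices` is a hypothetical helper, and the hue range of 0 to 0.75 is my own guess at "red to purple". It does nothing about color blindness.

```python
import colorsys


def style_slices(counts: dict[str, int]) -> list[tuple[str, int, str]]:
    """Return (label, value, hex color) tuples, largest slice first."""
    # Step 1: sort by descending value so legend and chart share one order.
    ordered = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    n = len(ordered)
    styled = []
    for i, (label, value) in enumerate(ordered):
        # Step 2: spread hues evenly from red (0.0) toward purple (0.75).
        hue = 0.75 * i / max(n - 1, 1)
        r, g, b = colorsys.hsv_to_rgb(hue, 0.8, 0.9)
        color = "#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255)
        )
        styled.append((label, value, color))
    return styled


# Placeholder data, not taken from the article.
slices = style_slices({"category A": 12, "category B": 40, "category C": 7, "category D": 25})
for label, value, color in slices:
    print(f"{label:12} {value:4} {color}")
```

The resulting list can be handed to whatever plotting library draws the chart, so the legend and the wedges come out in the same order.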

------
jplayer01
> Look to smaller more nimble Linux distributions that start shipping memory
> safe code

Are any distributions moving in this direction yet? Seems to me like C/C++
applications are so foundational to Linux that this would be a huge
undertaking. Are there even any memory safe window managers? Hell, even then
you're reliant on X or Wayland. PulseAudio/Alsa are also hard to replace.

edit: Quick google search only gives me
[https://github.com/mesalock-linux/mesalock-distro](https://github.com/mesalock-linux/mesalock-distro)
which is trying to replace common tools with Rust/Go counterparts. I'm not
sure how active it is. The main repo only has one main contributor.

~~~
masklinn
> Are there even any memory safe window managers?

I assume "non-toy" is included in there, but there's xmonad at least.

~~~
IceDane
5+ years of using xmonad and never had a crash that I can remember.

~~~
owl57
Do other window managers crash?

~~~
Filligree
Yes, often. I can't count the number of times KWin has crashed on me. You
might not notice, because it automatically restarts -- usually.

------
blodovnik
So what are the alternatives for system programming?

Rust, Golang? Any other true contenders?

~~~
masklinn
It really depends what you mean by "system programming".

If you mean things like foundational libraries & network stacks, then go is
not a contender[0]. If you mean system daemons and the like, then languages
like ocaml, D, … should also work (and that's assuming you want / need the
performance of a native binary, if you don't then the world's your oyster).

[0] to my understanding — and I may be completely mistaken here — memory-safe
ADA with dynamic allocation and without GC is pretty much an active research
field so fails either "memory safe" or "suitable for going fast"

~~~
titzer
> then go is not a contender[0] > [0] ... memory-safe ADA with dynamic
> allocation and without GC is pretty much an active research field so fails
> either "memory safe" or "suitable for going fast"

Why would D with a conservative collector be "more efficient"? How does being
conservative (with stack roots, presumably) fundamentally make a GC better?

Why would OCaml be better than Go? OCaml is garbage-collected, no opting out.

~~~
masklinn
> Why would D with a conservative collector be "more efficient"? How does
> being conservative (with stack roots, presumably) fundamentally make a GC
> better?

It wouldn't? The footnote was for Ada's applicability in the context of
"foundational libraries & network stacks" (as it's often advertised as a very
safe yet low-level language), because in my understanding (and again I could
be wrong here) it's either GC'd and memory-safe or neither, so same as e.g. D,
making it unsuitable for that layer.

> Why would OCaml be better than Go? OCaml is garbage-collected, no opting
> out.

It wouldn't either? The "also" in the second clause indicates that Go _would_
be suitable for "systems daemons and the like", and so would pretty much any
other memory-safe language, possibly restricted to the more efficient ones
(non-interpreted / JITed) depending on the specific use-case.

~~~
pjmlp
Real Time Java on bare metal, Oberon on bare metal (Astrobe), System C#,
Modula-3,...

------
praptak
Maybe this will fuel some much needed advances in static analysis, although
the most popular memory-unsafe languages are also very hard to model formally.

------
tinktank
I'm getting really tired of these clickbait headlines. Nowhere does this article
assert the title.

~~~
tinktank
I just read it again, and this comes off as another indirect push for
Mirage/Unikernels. I hope it isn't and I'm just being cynical.

~~~
justincormack
(author here, ex-C programmer) It's not. It's just against C, and the dependence
we have built on it, and the huge number of issues we are seeing now. I don't
care what memory safe language you use. A safe userspace for Linux is
achievable without as much change as unikernels.

------
pjmlp
> Most of the C and C++ code that causes the majority of open source CVEs is
> shipped in Linux distributions.

Under what is usually a very strict patch review process, with kernel sanitizers,
and yet....

~~~
masklinn
AFAIK most of the CVEs are not for the kernel itself (and the article doesn't
say anything about replacing or rewriting the kernel itself), it's for
everything that's built upon it. That's also a point Bryan Cantrill makes in
one of his talks:

> the safety argument just doesn't carry as much weight for kernel developers,
> not because the safety argument isn't really, really important. It's just
> because it's safe, because when it's not safe, it blows up, and everyone
> gets really upset. We figure out why. We fix it. And we develop a lot of
> great tooling to not have these problems.

~~~
pjmlp
Google has a talk where they mention 68% of them are.

That is why they started the Kernel Self Protection Project and have been
sponsoring cleaning the kernel of bad C practices like VLAs.

~~~
masklinn
Interesting, do you have the link?

~~~
pjmlp
Yep, here is the playlist.

[https://www.youtube.com/playlist?list=PLbzoR-pLrL6rOT6m50HdJ...](https://www.youtube.com/playlist?list=PLbzoR-pLrL6rOT6m50HdJFYUHyvA9lurI)

Check all the talks from Google.

