
C Is Not a Low-Level Language - jodooshi
https://queue.acm.org/detail.cfm?id=3212479
======
mattnewport
This article makes some valid points but is, I think, overall rather
misleading. Almost all of the reasons given why C is "not a low-level
language" also apply to x86/x64 assembly. Register renaming, cache
hierarchies, out-of-order and speculative execution, etc., are not visible at
the assembly / machine code level either, on Intel or on other mainstream CPU
architectures like ARM or PowerPC. If C is not a low-level language, then no
low-level language exists for modern CPUs, and since all other languages
ultimately compile down to the same instruction sets, they all suffer from
some of the same limitations.

It's really backwards compatibility of instruction sets / architectures that
imposes most of these limitations. Processors that get around them to some
degree, like GPUs, do so by abandoning some amount of backwards compatibility
and/or general-purpose functionality, and that is in part why they haven't
displaced general-purpose CPUs for general-purpose use.

~~~
ge0rg
I also had the initial impression that the article is misleading, but later
on the author makes the point that the C compiler is doing significant work
to reorder / parallelize / optimize the code. I agree that x86/x64 is not a
low-level language either, but even if it were, given the description the
author provided, I'd agree with his point that C is not low-level.

Regarding cutting off backwards compatibility to improve the design, Intel's
Itanium (affectionately called "Itanic") was a very progressive approach to
shifting the optimization work from the CPU (and the compiler) to just the
compiler. I'm not sure what the reasons for its failure were, though.

~~~
skywhopper
There was an article on Hacker News recently that covered some of the reasons
for Itanium's failure to realize its theoretical benefits. I'm not finding it
now, but IIRC the argument was that predicting likely-parallelizable code is
actually a lot harder to do at compile time, and that, as with so many
ultra-optimized systems, the real world works much differently, and a
messier, more random approach ultimately yields far better performance.

~~~
drbawb
> the argument made was that predicting likely-parallelizable code is
> actually a lot harder to do at compile time

So don't do it at compile time? That's really a very weak argument against
the Itanium ISA, and honestly more of an argument against the AOT compilation
model. Take a runtime with a great JIT, like the JVM or V8, and teach it to
emit instructions for the Itanium ISA. (As an added advantage, these runtimes
are extremely portable and can be run, with fewer optimizations, on other
ISAs without issue.)

The problem, as always, is that nobody with money to spend ever wants to part
with their existing software. (Likely written in C.) In 2001 Clang/LLVM didn't
even exist, and I'm not familiar with any C compilers of the era that had so
much as a rudimentary JIT.

~~~
mattnewport
There's not that much overlap between the kind of optimizations that JITs do
and the optimizations that modern CPUs do. The promise of JITs outperforming
AoT compiled code has never really materialized. The performance advantages of
OoO execution, speculative execution, etc. are very real and all modern high
performance CPUs do them. Attempts to shift some of that work onto the
compiler like Itanium and Cell have largely been failures.

~~~
dnautics
arguably the "sufficiently advanced compiler" (cue joke) has arrived (sadly,
post Itanium, Cell) in the form of a popularized LLVM[0], so it's improper to
claim failure based on two, aged datapoints.

The flaws of OOO and SpecEx are evident with the overhead required to secure a
system (spectre, meltdown) in a nondeterministic computational environment,
and there is certainly a _power_ cost to effectively JITting your code on
every clock cycle.

As the definition of performance is changing due to the topping out of moore's
law and shifting paralellism from amdahl to gustafson, I think there is a real
opportunity for non ooo, non specex in th future.

~~~
mattnewport
OoO and speculative execution are largely improving performance based on
dynamic context that in most real world cases is not available at compile
time. They are able to do so much more efficiently than software JITting can
due to being implemented in hardware. There is still no sufficiently advanced
compiler to make getting rid of them a good strategy for many workloads.

Most of what OoO and speculative execution are doing for performance on modern
CPUs is hiding L2 and L3 cache latency. On a modern system running common
workloads it's pretty unpredictable when you're going to miss L1 as it's
dependent on complex dynamic factors. Cell tried replacing automatically
managed caches with explicitly managed on chip memory and that proved very
difficult to work with for many problems. There's been little investment in
technologies to better use software managed caches since then because no other
significant CPU design has tried it. It's not a problem LLVM attempts to
address to my knowledge.
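
A minimal sketch of that unpredictability (not from the article, just an
illustration): both functions below compute a sum, but in the second the
address of every load comes from the previous load's result, so how much
latency the OoO hardware manages to hide depends entirely on run-time cache
behavior:

    #include <stddef.h>

    struct node { long value; struct node *next; };

    /* Streaming walk: every address is computable in advance, so
       prefetchers and out-of-order execution can run far ahead. */
    long sum_array(const long *a, size_t n) {
        long s = 0;
        for (size_t i = 0; i < n; i++)
            s += a[i];
        return s;
    }

    /* Pointer chase: the next address is the result of the current
       load, so each miss stalls the chain; whether a given node is
       in L1 is only knowable at run time. */
    long sum_list(const struct node *p) {
        long s = 0;
        for (; p; p = p->next)
            s += p->value;
        return s;
    }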

Other perf problems are fundamental to the way we structure code. C++'s
performance advantages come in part from very aggressive inlining, but OoO is
important when inlining is not practical, which is still much of the time.

~~~
dnautics
My point is that the dominant software programming paradigm is migrating away
from highly dynamic toward highly regular. A good example is machine
learning, where for any given pipeline your matrix sizes are generally going
to stay the same. A good compiler can distribute the computation quite well
without much trouble, and this code will almost certainly not need
SpecEx/OoO (which is why we put these workloads on GPUs and TPUs). Or imagine
a billion tiny cores, each running a fairly regularly-shaped lambda.

Sure some things like nginx gateways and basic REST routers will have to
handle highly dynamic demands with shared tenancy, but the trends seem to me
to be away from that. As you say, this is all dependent on the structure of
code; and I think our code is moving towards one where the perf advantages
won't depend on OoO and specex for many more cases than now.

~~~
mattnewport
This might be true for some domains, but it's far from true for the
performance-sensitive domains I'm familiar with: games / VR / realtime
rendering. The trend there is, if anything, the opposite, as expectations
around scene and simulation complexity are ever increasing.

------
munificent
I really really liked this article, and reading the comments here is blowing
my mind. Did we read the same thing?

I think it's a strong insight that chip designers and compiler vendors have
spent person-millennia maintaining the illusion that we are targeting a
PDP-11-like platform even while the platform has grown less and less like
that. And, it turns out, with things like Spectre and the performance cost of
cache misses, that abstraction layer is quite leaky in potentially disastrous
ways.

But, at the same time, they have done such a good job of maintaining that
illusion that we forget it isn't actually reality.

I like the title of the article because many programmers today _do_ still
think C is a close mapping to how chips work. If you happen to be one of the
enlightened minority who know that hasn't been true for a while, that's
great, but I don't think it's good to criticize the title on that basis.

~~~
skybrian
It seems like the article is mostly useful for inspiring research; that is,
most of us aren't the target audience.

I'm wondering what will happen as GPUs become more general-purpose. What's
next after machine learning?

Would it be possible to make a machine where all code runs on a GPU? How
would GPUs have to change to make that possible, and would it result in
losing what makes them so useful? What would the OS and programming language
look like?

~~~
antris
_> It seems like the article is mostly useful for inspiring research; that is,
most of us aren't the target audience._

As a group of professionals, we benefit greatly from being interested in
these things. People who design languages and compilers do so largely based
on what is perceived as being demanded, and _we as programmers are the ones
who create the demand for new languages_.

To put it another way, if programmers aren't aware of what's going wrong with
our current languages, they cannot express their need for new ones, so
there's less incentive for researchers to produce new ways of programming
computers. It is much more tempting to "please the masses" in a way that
feeds this local-maximum problem: it's much more rewarding to research
problems that translate into mainstream use than academic things that nobody
actually uses.

~~~
skybrian
I still think a research project (either academic or in industry) with a
hardware component would be the best way to explore radical new processor
architectures that are further away from C.

- Hardware designers are conservative. They aren't likely to implement a
different hardware architecture because "programmers demand it" (really?),
unless there's existing research showing how it can be done and a compelling
reason why customers will buy the chips.

- As a hobbyist language designer, I'm still going to target something that
exists and is popular: x86, JavaScript, wasm, C, or something like that. A
low-level language targeting a platform that doesn't exist isn't all that
appealing. But, someone might get some good papers out of doing the research.

------
United857
It's worth noting that chips that were designed for high-performance
computing (e.g. the Cell) from the outset generally _don't_ have silicon
devoted to things like out-of-order execution, register renaming, etc. In
this case, the bulk of the optimization logic _does_ shift to the programmer
(aided by the compiler).

The reason is that in these domains (e.g. game consoles, supercomputing), you
know ahead of time the precise hardware characteristics of your target, you
can assume it won't change, and can thus optimize specifically for that ahead
of time.

This isn't true for "mass-market" software that needs to run across multiple
devices, with many variants of a given architecture.

~~~
mattnewport
> The reason is that in these domains (e.g. game consoles, supercomputing),
> you know ahead of time the precise hardware characteristics of your target,
> you can assume it won't change, and can thus optimize specifically for that
> ahead of time.

Cell was a failure in large part because this proved to be less true / less
relevant than its designers thought.

Source: many late nights / weekends trying to get PS3 launch titles performing
well enough to ship.

~~~
scott_s
I did work in graduate school trying to make the Cell easier to program --
basically, providing OpenMP-like abstractions that would take advantage of
the SPEs. I've always been really curious: how much did your games take
advantage of the SPEs? When did you send code to the SPEs versus using the
GPU? Were you using libraries that helped manage the SPEs, or did you do all
of it manually?

~~~
mattnewport
OpenMP is a bad approach for the types of problems commonly encountered in
games and graphics programming in my experience. Matt Pharr's excellent series
of articles on the history of ISPC gives some good explanations of what
programming models actually work well for graphics particularly:
[http://pharr.org/matt/blog/2018/04/30/ispc-all.html](http://pharr.org/matt/blog/2018/04/30/ispc-all.html)

At the time I was doing most of my SPE work (helping to optimize launch
titles at EA prior to the launch of the PS3), most titles weren't taking much
advantage of them at all. We were a central team helping move over some code
that seemed like it would benefit most; I was particularly involved in moving
animation code to the SPEs. There weren't really any options for libraries to
help at that point, other than things we were building internally, so it was
almost all manual work.

Later on in the PS3 lifecycle people moved more and more code to the SPEs. To
my knowledge most of that work was largely manual still. For a while I was
project lead on EA's internal job/task management library which had had a big
focus on supporting use of the SPEs but my involvement in it was mostly during
the early part of the Xbox One / PS4 generation. The Frostbite graphics team
in particular did a lot of interesting work shifting GPU work over to the SPEs
(I think some of it they've talked about publicly) but I wasn't directly
involved in that.

~~~
scott_s
I completely believe you on OpenMP being bad for games and graphics
programming; we were targeting the HPC community which had a heavy interest in
Cell as well. But all the while, I knew a bunch of programmers out in the
world were shipping Cell code, and I was always curious what their patterns
were. Thanks for the answers!

------
umanwizard
The points made in the article are certainly valid, but C is low-level in an
abstract sense: it is approximately the intersection of all mainstream
languages.

I.e. if a feature exists in C, it probably exists in every language most
programmers are familiar with. (I worded this statement carefully to exclude
exotic languages like Haskell or Erlang).

Thus C, while not low-level relative to actual hardware, is low-level
relative to _programmers' mental model of programming_. If this is what we
mean, it's still true and useful to think of C as a low-level language.

That said, it's important to keep the distinction in mind -- statements like
"C maps to machine operations in a straightforward way" have been
categorically wrong for decades.

~~~
typomatic
> I worded this statement carefully to exclude exotic languages like Haskell
> or Erlang

I suspect that your definition of "exotic" is exactly "not like C".

~~~
mda
Which is kinda true. Most popular languages are C-like.

~~~
pjmlp
30 years ago the landscape looked quite different.

------
Rebelgecko
Going by their definition, I don't think there are any low level languages, at
least on modern architectures. Even x86 assembly abstracts out a lot of what
is going on within the CPU.

~~~
umanwizard
That doesn't mean the definition is useless -- rather than "C isn't a low-
level language, as opposed to something else which is", the point might be
"there exist no low-level languages according to most people's understanding
of that term". Which is still an interesting and useful fact.

~~~
rbanffy
It also hides the fact that C is just a couple of notches above the absolute
minimum most people would even consider (writing assembly code by hand) and
is, effectively, the lowest level most programmers will ever venture to.

~~~
fixermark
True, but one of the points the article makes is that in practice there's a
vast gulf of distance (person-years of C compiler development) between the C
code one writes and the resulting assembly output (and this is ignoring the
fact that x86 assembly is, itself, an abstraction that co-evolved with C-like
languages and is basically emulated on modern massively parallel CPU
architectures).

In that regard, a case can be made that when you're writing in C, you're
writing exactly as close to the bare metal as if you were writing in, say, Go
or Haskell.

~~~
simen
> In that regard, a case can be made that when you're writing in C, you're
> writing exactly as close to the bare metal as if you're writing in, say, Go
> or Haskell.

No, you really can't. This is childish black and white thinking. The
computational model of C is built on an interface exposed by the hardware. Go
and Haskell build many additional abstractions _on top of_ that same model.

This article could have prompted a fruitful discussion about what the author
is trying to say, but by choosing such a clickbait title, he turned it into a
discussion on semantics that wants to deny useful distinctions because, in
some context (not the context in which the term is actually used), they don't
fit.

This kind of linguistic wankery really pisses me off, because it's useless
and rests on a misunderstanding of how people actually use language (which is
to say, _in context_ and often _in relative terms_).

~~~
fixermark
But I believe that's the article's very point: the context and relative terms
in which people often talk about C are incorrect. The mutation delta between
a C program and the corresponding assembly instructions is significant, but
people continue to believe it is not, which results in all sorts of incorrect
assumptions when reasoning about a C program (such as which line of code or
statement executes "first").

Haskell, Go, et al. are understood to have complex runtime machinery atop the
x86 instruction set. It's an erroneous belief that C does not (and one that
I've seen developers get bitten by repeatedly as they try to manage threaded
C code).

------
ChuckMcM
I enjoyed reading this, mostly because it made me angry, then curious, then
thoughtful all in one go.

Partly because I really like the PDP-11 architecture and its 'separated at
birth' twin, the 68K; it greatly influenced how I think about computation. I
also believe that one of the reasons the ATMega series of 8-bit micros was so
popular is that it was more amenable to a C code generator than either the
8051 or PIC architectures were.

That said, computer languages are similar to spoken languages in that a
concept you want to convey can be made more easily or less easily understood
by the listener depending on the vocabulary and structure available to you.

Many useful systems abstractions (queues, processes, memory maps, schedulers)
are pretty easy to express in C; complex string manipulation, not so much.

What endeared C to its early users was that it was a 'low constraint'
language; much like Perl, it historically had a fairly loose policy about
rules in order to allow a wider variety of expression. I don't know if that
makes it 'low', but it certainly helped it be versatile.

------
dahart
> A processor designed purely for speed, not for a compromise between speed
> and C support, would likely support large numbers of threads, have wide
> vector units, and have a much simpler memory model.

Sounds like a GPU?

> Running C code on such a system would be problematic, so, given the large
> amount of legacy C code in the world, it would not likely be a commercial
> success.

It seems like ATI & NVIDIA are doing okay, even with C & C++ kernels. GLSL and
HLSL are both C-like. What is problematic?

~~~
tsomctl
C-like code that runs on GPUs is not even close to normal C, even though the
syntax is similar. The way you lay out your memory, schedule your threads,
and add memory barriers is completely different. You are never going to take
a large piece of C code written for a CPU and just run it directly on a GPU.

~~~
dahart
Huh, that’s weird, I run a C++ compiler directly on my GPU code. The only
difference between CPU and GPU code at the function level is whether I tag it
with a __global__ macro or not, and lots of functions compile and run for both
CPU and GPU.

Memory layout, thread scheduling, and barriers are not features of the C
language and have nothing to do with whether your C is “normal”. Those are
part of the programming model of the device you’re using, and apply to all
languages on that device. Normal C on an Arduino looks different than normal C
on an Intel CPU which looks different than normal C on an NVIDIA GeForce.

~~~
tsomctl
OK, I guess it comes down to what you call "normal" C. I was defining it as
what would run on x86 Windows or Linux.

~~~
my123
You can look at C++ AMP too; it runs on all GPUs that support DX11 on
Windows and is part of the Windows SDK. It's implemented by AMD ROCm on
Linux, which also implements HIP/CUDA. Normal C/C++ can run fine on modern
GPU architectures.

~~~
pjmlp
NVidia designed their latest GPU architecture to run C++.

------
ovao
To me the argument's akin to suggesting that Robert Wadlow wasn't tall,
because giraffes are taller than Robert Wadlow.

When the spectrum of the context is unambiguous, that's not an argument for
finding a way to make it ambiguous.

~~~
Sean1708
I think that would be a fair point if the article was about whether or not we
should call C a low-level language, but the article is actually about whether
C maps cleanly onto what the machine actually does and what a machine might
look like if we didn't have that expectation.

------
cryptonector
> The root cause of the Spectre and Meltdown vulnerabilities was that
> processor architects were trying to build not just fast processors, but fast
> processors that expose the same abstract machine as a PDP-11. [...]

This strikes me as a flavor of the VLIW+compilers-could-statically-do-more-of-
the-work argument, though TFA does not mention VLIW architectures.

C or not, making compilers do more of the work is not trivial, it is not even
simple, not even hard -- it's insanely difficult, at least for VLIW
architectures, and it's insanely difficult whether we're using C or, say,
Haskell. The only concession to make is that a Haskell compiler would have a
lot more freedom than a C compiler, and a much more integrated view of the
code to generate, but still, it'd be insanely hard to do all of the scheduling
in the compiler. Moreover, the moment you share a CPU and its caches is the
moment that static scheduling no longer works, and there is a lot of economic
pressure to share resources.

There are reasons that this make-the-compilers-insanely-smart approach has
failed.

It might be more likely to succeed _now_ than 15 years ago, and it might be
more successful applied to Rust or Haskell or some such than to C, but,
honestly, I just don't believe this will work anytime soon, and it's all
academic anyway as long as CPU architects keep churning out CPUs with hidden
caches and speculative execution.

If you want this to be feasible, the first step is to make a CPU where you can
turn off speculative execution and where there is no sharing between hardware
threads. This could be an extension of existing CPUs.

A much more interesting approach might be to build asynchrony right into the
CPUs and their ISAs. Suppose LOADs and STOREs were asynchronous, with an
AWAIT-type instruction by which to implement micro event loops... then
compilers could effectively do CPS conversion and automatically make your code
locally async. This _is_ feasible because CPS conversion is well-understood,
but this is a far cry from the VLIW approach. Indeed, this is a lot simpler
than the VLIW approach.

TFA mentions CMT and UltraSPARC, and that's certainly a design direction, but
note that it's one that makes C less of a problem anyway -- so maybe C isn't
the problem...

Still, IMO TFA is right that C is a large part of the problem. Evented
programs and libraries written in languages that insist on immutable data
structures would help a great deal. Sharing even less across HW/SW threads
(not even immutable data) would still be needed in order to eliminate the
need for cache coherency, but just having immutable data would help reduce
cache-snooping overhead in actual programs. But the CPUs will continue to be
von Neumann designs at heart.

------
kev009
The meta point from the article is that this is as much a hardware problem as
it is a language or developer one. An arms race was waged to create CPUs that
are very effective at running sequential programs, to the point that what
they present to the program is very much a facade, and they hide an
increasingly large amount of internal implementation detail. By David's
postulation, even the native assembly language for the CPU is not low-level.

To drive this juxtaposition home, I'd point to PALcode on Alpha processors,
where C (and other languages) can very much be low-level. Very few commercial
processors let you code at the microcode level.

The overarching premise is then brought home by GPU programming, which shows
that you don't necessarily need to be writing at the ucode level if the
ecosystem is built around how the modern hardware functions.

------
scott_s
The author, David Chisnall, is a co-author on a related paper from PLDI 2016:
"Into the Depths of C: Elaborating the De Facto Standards",
[https://news.ycombinator.com/item?id=11805377](https://news.ycombinator.com/item?id=11805377)

~~~
favorited
He was also one of the earliest non-Apple contributors to Clang, was on the
FreeBSD core team, and wrote the modern GNU Objective-C runtime
implementation. His work on Objective-C in particular is prolific.

~~~
sizeofchar
Also, his book on Objective-C is the best one I've read.

------
compiler-guy
There is an entire junkyard full of processors designed to run other languages
well.

LISP machines in the '70s and '80s, Java machines in the '90s, many others.

For whatever reason, successful general purpose silicon has almost always
followed a C-ish model.

It's also worth noting that Fortran runs quite well on C-ish style processors.

~~~
gpderetta
Exactly. While CPU designers c will certainly make sure they can run C code
fast, it turns out that, for the last 40 years at least, the C model
(sequential, procedural, mostly flat address space) is the most efficient to
implement in hardware.

------
davidw
"C combines the power and performance of assembly language with the
flexibility and ease-of-use of assembly language."

------
salgernon
It feels like the author isn't really talking so much about the limitations
of C on modern architectures as about the architectures themselves.

Possibly relevant is this (short?) discussion[1] from 2011 about a CPU more
closely designed for functional programming.

[1]
[https://news.ycombinator.com/item?id=2645423](https://news.ycombinator.com/item?id=2645423)

------
angry_octet
It is instructive to consider GPUs and their compilers. The death of OpenGL
in favour of Vulcan has come about because OpenGL is unable to express the
low-level constructs which are essential to achieving performance. GPU
drivers are actually compilers that recompile shaders into efficient machine
code.

Thus the fundamental limitation is that the processor exposes only a C ABI.
If there were a vectorisation- and parallelism-friendly ABI, it would be
possible to write high-level language compilers against it. It should be
possible for such an ABI to coexist with the traditional ASM/C ABI, with a
mode switch for different processes.

~~~
angry_octet
s/vulcan/vulkan damn autocorrect.

------
arghwhat
It is correct that C is not _really_ a low-level language, but the points
about how C limits the processor don't make much sense.

It uses UltraSPARC T1 and above processors as an example for a "better"
processor "not made for C", but this argument makes no sense at all. The
"unique" approach in the UltraSPARC T1 was to aim for many simple cores rather
than few large cores.

This is simply about prioritizing silicon. Huge cores, many cores,
small/cheap/simple/efficient die. Pick two. I'm sure Sun would have _loved_ to
cram huge caches in there, as it would benefit everything, but budgets,
deadlines and target prices must be met.

Furthermore, the UltraSPARC T1 was designed to support existing C and Java
applications (this was Sun, remember?), despite the claim that this was a
processor "not designed for traditional C".

There are very few hardware features one can add to a conventional CPU (which
even includes things like the Mill architecture) that would not benefit C as
well, and I cannot possibly imagine a feature that would benefit other
languages while being _harmful_ to C. The example of loop count inference for
ARM SVE being hard in C is particularly bad. It is certainly no harder in the
common case of a for loop than it is to deduce the length of an array to
which a map function is applied.
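
For what it's worth, the common case under discussion looks something like
the sketch below (restrict is only there to let the compiler assume no
aliasing). The trip count n is exactly as visible here as the length of an
array being mapped over:

    #include <stddef.h>

    /* The loop bound n is explicit, so a compiler targeting a vector
       ISA (SVE, AVX, ...) can strip-mine this into whole vectors plus
       a scalar remainder, or into SVE's vector-length-agnostic form. */
    void scale(float *restrict out, const float *restrict in,
               float k, size_t n) {
        for (size_t i = 0; i < n; i++)
            out[i] = k * in[i];
    }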

I cannot imagine a single compromise made in a CPU as a result of
conventional programming/C -- short, that is, of replacing the CPU with an
entirely different device type, such as a GPU or FPGA.

~~~
dgreensp
The point is specifically about parallel vs sequential programs. Legacy C code
is sequential, and the C model makes parallel programming very difficult.

I met a guy back in college, a PhD who went to work at Intel, who told me the
same thing. In theory, the future of general purpose computing was tons of
small cores. In practice, Intel's customers just wanted existing C code to
keep running exponentially faster.

~~~
arghwhat
> Legacy C code is sequential, and the C model makes parallel programming very
> difficult.

Neither of these statements is true, unless "legacy" refers to the early days
of UNIX.

Tasks that parallelize poorly do not benefit from many small cores. This is
usually a result of either dealing with a problem that does not parallelize,
or an _implementation_ that does not parallelize (because of a poor design).
Neither of these attributes is related to language choice.

An example of something that does not parallelize _at all_ would be an
AES256-CBC implementation. It doesn't matter what your tool is: Erlang,
Haskell, Go, Rust, even VHDL. It cannot be parallelized or pipelined. INFLATE
has a similar issue.
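
The chaining that blocks this is easy to see in a schematic sketch of CBC
encryption; aes_encrypt_block below is a hypothetical stand-in for a real
block-cipher primitive. Block i's input includes block i-1's output, so
iterations cannot overlap. (CBC _decryption_, by contrast, parallelizes
fine, since all the ciphertext blocks are available up front.)

    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical one-block primitive; any real AES library's
       block-encrypt call would play this role. */
    void aes_encrypt_block(uint8_t out[16], const uint8_t in[16],
                           const void *key);

    void cbc_encrypt(uint8_t *ct, const uint8_t *pt, size_t nblocks,
                     const uint8_t iv[16], const void *key) {
        const uint8_t *prev = iv;
        uint8_t x[16];
        for (size_t i = 0; i < nblocks; i++) {
            for (int j = 0; j < 16; j++)
                x[j] = pt[16 * i + j] ^ prev[j]; /* mix in previous block */
            aes_encrypt_block(&ct[16 * i], x, key);
            prev = &ct[16 * i]; /* loop-carried dependency: serializes */
        }
    }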

For such algorithms, the only way to increase throughput is to increase
single-threaded performance. Adding cores increases total capacity, but
cannot increase throughput. For other tasks, the synchronization costs of
parallelization are too high. I work for a high-performance network equipment
manufacturer (100Gb/s+), and we are certainly limited by sequential
performance. We have _custom hardware_ to load-balance data to different CPU
sockets, as software-based load distribution would be several orders of
magnitude too slow. The CPUs just can't access memory fast enough, and many
slower cores wouldn't help, as they'd both be slower _and_ incur overheads.

Go and Erlang of course provide built-in language support for easy
parallelism, while in C you need to pull in pthreads or a CSP library
yourself; but the C model doesn't make parallel programming "very difficult",
nor is C any more sequential by nature than Rust. It is also incorrect to
assume that you can parallelize your way to performance. In reality, "tons of
small cores" is mostly just good at increasing total capacity, not
throughput.
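
"Pulling in pthreads" amounts to something like this minimal sketch (POSIX
threads, error handling elided):

    #include <pthread.h>
    #include <stdio.h>

    static void *worker(void *arg) {
        printf("worker %ld running\n", (long)arg);
        return NULL;
    }

    int main(void) {
        pthread_t tid[4];
        for (long i = 0; i < 4; i++)
            pthread_create(&tid[i], NULL, worker, (void *)i); /* fan out */
        for (int i = 0; i < 4; i++)
            pthread_join(tid[i], NULL);                       /* fan in */
        return 0;
    }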

~~~
dgreensp
I admit it's not fair to blame C in particular. The comparison is between how
we write and execute software and how we could write and execute software, and
the language absolutely comes into play, in addition to how the language is
conventionally used. "Legacy" code in this context is code that was written in
the past and is not going to be updated or rewritten.

I disagree that tasks performed by a computer either don't parallelize or
carry too high a cost of synchronization. At a fine-grained level, our
compilers vectorize (i.e., parallelize) our code -- with limits imposed by
C's "fake low-levelness" as described in the article -- and then our
processors exploit all the parallelism they can find in the instructions. At
a coarser level, even if calculating a SHA (say) isn't parallelizable,
running a git command computes many SHAs. The reasons why independent
computations are not done on separate processors -- even automatically --
come down to programming language features (how easy it is to express or
discover the independence, one way or another) and real or perceived
performance overhead. Hardware can be designed so that synchronization
overhead doesn't kill the benefits of parallelization. GPUs are a case in
point.

The world is going in the direction of N cores. We'll probably get something
like a mash-up of a GPU and a modern CPU, eventually. If C had been overtaken
by a less imperative, more data-flow-oriented language, such that everyone
could recompile their code and take advantage of more cores, maybe these
processors would have come sooner.

~~~
arghwhat
Rant time.

> "Legacy" code in this context is code that was written in the past and is
> not going to be updated or rewritten.

In that case, I would not say Legacy code is sequential. For the past few
decades, SMP has been the target where sensible/possible.

> At a fine-grained level, our compilers vectorize (i.e. parallelize) our
> code.

Vectorization is a hardware optimization designed for a very specific use
case: performing an instruction _f_ _N_ times on a buffer of _N_ inputs, by
replacing _N_ instantiations of _f_ with a single _fN_ instance.
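
Concretely, a sketch of the scalar form an auto-vectorizer consumes; the
compiler turns groups of iterations into single SIMD instructions (e.g. one
AVX vaddps covering 8 floats), which is more work retired per instruction on
one core, not work spread across cores:

    /* Scalar f applied N times; a vectorizing compiler replaces
       batches of iterations with one vector instruction each. */
    void add_k(float *restrict a, float k, int n) {
        for (int i = 0; i < n; i++)
            a[i] += k;
    }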

If this is parallelization, then an Intel Skylake processor is already a
_massively_ parallel unit, with each core already executing massively in
parallel by having the micro-op scheduler distribute work across the
available execution ports and units.

In reality, vectorization has very little to do with parallelization.
Vectorization is _much_ faster than parallelization (in many cases,
parallelization would be slower than purely sequential execution), and in a
world where all the silicon budget goes to parallelization, vector
instructions would likely be killed in the process. You can't have both
absurd core counts _and_ fat cores. If you did, it would just be adding cores
to a Skylake processor.

(GPUs have reduced feature sets compared to Skylake processors not because
they don't _want_ the features, but because they don't have room—they just
specialize to save space.)

> At a coarser level, even if calculating a SHA (say) isn't parallelizable,
> running a git command computes many SHAs.

And this is exactly why Git starts worker processes on all cores whenever it
needs to do heavy lifting.

This has been the approach for the past few decades, which is why I twitch a
bit at your use of "legacy" as "sequential": if a task can be parallelized
across multiple cores (which is not a language issue), _and_ the task is even
remotely computationally expensive, then the developer parallelizes the
problem to use all available resources.

However, if the task is simple and fast already, parallelization is
unnecessary. Unused cores are not _wasted_ cores on a multi-tasking machine.
Quite the contrary. Parallelization has an overhead, and that overhead is
taking cycles from other tasks. If your target execution time is already met
on slow processors in sequential operation, then remaining sequential is
probably the best choice, _even_ on massively parallel processors.

Git has many commands in both of those buckets. Clone/fetch/push/gc are
examples of "heavy tasks" which utilize all available resources; show-ref is
obviously sequential. If a Git command that is currently sequential ends up
taking noticeable time, _and_ is a parallelizable problem (as in computing a
thousand independent SHAs), then the task would be parallelized very fast.

Unless something revolutionary happens in programming language development,
it will always be an active decision to parallelize. Even Haskell requires
explicit parallelization markers, despite being about as magical as
programming gets (magical meaning "not even remotely describing CPU execution
flow").

> Hardware can be designed so that synchronization overhead doesn't kill the
> benefits of parallelization. GPUs are a case in point.

I do not believe that this is true at all. GPUs do not combat
synchronization overhead in the slightest, lack features that a CPU uses for
efficient synchronization (they cannot yield to other tasks or sleep, only
spin), and run at much lower clocks, emphasizing the inefficiencies.

After reading some papers on GPU synchronization primitives (this one in
particular:
[https://arxiv.org/pdf/1110.4623.pdf](https://arxiv.org/pdf/1110.4623.pdf)),
it would appear that GPU synchronization is not only no better than CPU
synchronization, but a total mess. At the time the paper was written, the
normal approach to synchronization was apparently hacks like terminating the
kernel entirely to force global synchronization (extremely slow!) or just
using spinlocks, which are _way_ less efficient than what we do on CPUs. Even
the methods proposed by that paper are in reality just spinlocks (the XF
barrier is just a spinning volatile access, as GPUs _cannot sleep or
yield_).

All this effectively makes a GPU _much_ worse at synchronizing than a CPU. So
why are GPUs fast? Because the kinds of tasks GPUs were designed for do not
involve synchronization. This is the best-case parallel programming scenario,
and the scenario where GPUs shine.

I'd also argue that if GPUs _had_ a trick up their sleeve in the way of
synchronizing cores, Intel would have added it to x86 CPUs in a heartbeat, at
which point synchronization libraries and language constructs would be
updated to use it where available. Intel doesn't hesitate over new
instruction sets, and the GPU paradigm is not actually all that different
from a CPU's.

> The world is going in the direction of N cores. We'll probably get something
> like a mash-up of a GPU and a modern CPU.

It's the only option, due to physics. If physics didn't matter, I don't think
anyone would mind having a single 100GHz core.

However, it won't be a "mash-up of a GPU and a modern CPU", simply because a
GPU is not fundamentally different from a CPU. A GPU mostly just has a
different budgeting of silicon and a more graphics-oriented choice of
execution units than a CPU, but the overall concept is the same.

> If C had been overtaken by a less imperative, more data-flow-oriented
> language, such that everyone could recompile their code and take advantage
> of more cores, maybe these processors would have come sooner.

A language that could automatically parallelize a task based on data-flow
analysis (without incurring massive overhead) would be cool. I don't know of
any, though. It seems optimal for something like Haskell or Prolog, but
neither can do it.

However, tasks that would benefit from parallelization are already easy to
tune to a different degree of parallelism, and parallelizing what
parallelizes poorly is not useful on any architecture.

Parallelization hasn't really been a problem for at least the last two
decades, and I certainly can't see it as the limiting factor for making
massively parallel CPUs. But massively parallel CPUs are not magical, and
many problems cannot benefit from them at all. It will almost always be a
trade of individual task throughput for total task capacity.

------
agumonkey
The thing is, most of the time you're reasoning at some logical level that
will not be the "reality". The problem is that C programmers think that C ===
reality === performance. C has better (lower) constant factors, but is by no
means better all the time.

------
DannyB2
The sophistication of the compiler does not mean the language is high level.

The meaning of a high-level language has to do with abstraction away from the
hardware. C programmers often wince at languages that are highly abstracted
away from the hardware. But those are what "high level" languages are.
Especially languages that remove more and more of the mechanical bookkeeping
of computation: garbage collection (aka automatic memory management), strong
or inferred typing, dynamic arrays and other collection structures,
unlimited-length integers and possibly even big-decimal numbers of unlimited
precision, symbols, pattern matching, lambda functions, closures, immutable
data, object programming, functional programming, and more.

By comparison C looks pretty low level.

Now I'm not knocking C. If there were a perfect language, everyone would
already be using it. Consider the Functional vs Object debate. (Or vi vs
emacs, tabs vs spaces, etc) But all these languages have a place, or they
would not have a widespread following. They all must be doing _something_
right for some type of problem.

C is a low level language. And there is NOTHING wrong with that! It can be
something to be proud of!

~~~
kiriakasis
One of the point of the article is that C is relatively high level by your
definition.

Basically it says that the C abstract machine has very little in common with
most existing processor.

moreover it makes the point that in the last decades of research for CPUs the
focus was "make C go fast" wich ultimately cause meltdown.

------
thinkling
TLDR: C was close to the metal on the PDP-11, but since then hardware has
become more complex while exposing the same abstraction to the C programmer.
That means hardware features such as speculative execution and L1/L2 caching
are invisible to the programmer. This was the cause of Spectre and Meltdown,
and it forces a lot of complexity into the compiler. GPUs achieve high
performance in part because their programming model goes beyond C. Processors
would be able to evolve if they weren't hamstrung by having to support C.

~~~
occamrazor
I don't understand. CPUs do not support C; they support a specific
instruction set. What stops them from having instructions for cache
management, pipelining, speculative execution hints, etc.?

~~~
coliveira
They do not support C officially, but every CPU designer knows that 99%+ of
the code that matters is written in C, so they design chips targeting this
translation from C. What the author wants is a better low-level interface
that would allow for modern processor features without the legacy of the
features available on the PDP-11.

~~~
umanwizard
> 99%+ of the code that matters is written in C

I think a better way of stating this is "99% of the code that matters is
written in C, or in a language designed with a similar target architecture as
C in mind". Certainly a lot of code that matters is written in C++,
Objective-C, and Java, but the same points hold true for all of those.

------
sytelus
Interesting tidbits from article:

 _A modern Intel processor has up to 180 instructions in flight at a time (in
stark contrast to a sequential C abstract machine, which expects each
operation to complete before the next one begins). A typical heuristic for C
code is that there is a branch, on average, every seven instructions. If you
wish to keep such a pipeline full from a single thread, then you must guess
the targets of the next 25 branches._
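
(The arithmetic behind that last figure: 180 in-flight instructions at one
branch per ~7 instructions is 180 / 7 ≈ 25 branch targets that must all be
predicted correctly.)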

 _The Clang compiler, including the relevant parts of LLVM, is around 2
million lines of code. Even just counting the analysis and transform passes
required to make C run quickly adds up to almost 200,000 lines (excluding
comments and blank lines)._

------
anfilt
I hate the idea of "low-level". There is not really such a thing. You should
be using a language suitable for the domain you're working in.

Sadly, too many programming languages try to be the end-all, be-all. C is a
language that is great for working in the systems domain.

Ideally, we would have small, minimalist languages for various problem
domains. In reality, maintaining and building high-quality compilers is a lot
of work. Moreover, a lot of development will just pile together whatever
works.

That aside, you could build a computer transistor by transistor, but it's
probably more helpful to think at the logic-gate level or in even larger
units. Heck, even a transistor is just a piece of silicon/germanium that
behaves in a certain way.

So there are levels of abstraction, but is an abstraction low-level? I think
the term probably came about to refer to the lower layers of abstraction that
make up whatever system you're using. So unless you're using something that
nothing can be built upon, everything, even what people would call high-
level, can be low-level.

Heck, people call JS a high-level language, but there are compilers that
compile to JS. That makes JS a lower-level system that something else is
built upon. This again shows why I would say "low-level" is often thrown
around with a connotation that is not exactly true.

------
judge2020
Archive.is link, as the page loaded incredibly slowly for me:
[http://archive.is/E9s70](http://archive.is/E9s70)

------
plpot
I find this article insightful, but it misses the points it tries to deliver.

What the article is very good at delivering is that current CPUs' ISAs export
a model that no longer matches reality. Yes, we might call that model the
PDP-11, though I miss that architecture dearly.

C was never meant to be a low-level language. It was a way to map loosely
onto assembler while providing some higher-level abstractions (functions,
structures, unions) for writing code that was more readable and structured
than assembler. And yes, it is far from perfect. And yes, today it is called
a low-level language, with good reason.

But this article is really about exposing the insanity that modern CPUs have
become, an insanity sacrificed to the altar of backward compatibility -- all
the CPU architectures that tried the path of not being compatible with older
CPUs have died.

I am pretty sure that once we have an assembler that maps closely to the
microcode, or to the actual internals of a modern, parallel, NUMA
architecture, we will still need a C-like language that introduces
higher-level features to ease the writing of the non-architecture-dependent
parts. And it will most probably be C.

------
rhacker
The article itself has 4 definitions or "attributes" for low-level languages
that can be considered contradictory:

* "A programming language is low level when its programs require attention to the irrelevant."

* Low-level languages are "close to the metal," whereas high-level languages are closer to how humans think.

* One of the common attributes ascribed to low-level languages is that they're fast.

* One of the key attributes of a low-level language is that programmers can easily understand how the language's abstract machine maps to the underlying physical machine.

So basically the entire article's premise (the title) hinges on the last
bullet, which can be contested. All the other attributes mentioned can be
applied to Java, C, C#, and C++ alike, so failing the last bullet point isn't
unique to C.

~~~
dgreensp
I think the author's point is that despite being perceived as low-level, C
doesn't really differ from, say, Java on the last bullet.

In other words, a programmer who sits down and uses C and not Java might
think, "I am being forced to pay attention to irrelevant things and think in
unnatural ways, but that's because I am writing fast code using operations
that map to operations done by the physical machine. In a higher-level
language like Java, more of these details are out of my control because they
are abstracted away by the language and handled by the compiler."

I think the article does a great job dismantling this point of view, and
telling the story that C is not so different from Java, aside from being
unsafe and ill-specified.

~~~
zkomp
Maybe true, but I think the Java example is not that good. Java is still not
that different from C; Java is more like a descendant of C and C++, and to be
honest both languages force you to pay attention to lots of irrelevant "low-
level" detail -- fictionally low-level, since it's not actually the machine
but the language itself (which is stuck in the PDP-11 mental model...).

Compare that to something genuinely different, like Erlang, Haskell, or Lisp.

~~~
dgreensp
High-level and low-level are relative, to be sure, but Java is definitely
considered higher-level than C -- it was designed to target a virtual machine,
for example, while C was designed to target real machines -- so I think it
illustrates the article's point perfectly.

------
Const-me
One reason for that is that for many applications latency is much more
critical than bandwidth. For PCs that's input-to-screen latency; for servers,
request-to-response. It's possible to make multicore processors with simpler
cores and design the OS and language ecosystem around them, etc. Such
tradeoffs would surely improve bandwidth, but they would harm latency.

Another reason is that most IO devices are inherently serial. Ethernet only
has 4 pairs, and wifi adapters are usually connected by a single USB or PCIe
lane. If a system has limited single-threaded (i.e. serial, PDP-11-like)
performance, it's going to be hard to produce or consume those gbits/sec of
data.

------
zwieback
Great article if you're willing to read past the headline. I would have liked
to see a mention of the small processors that are still hugely popular
(microcontrollers, etc.), where C is still a good fit.

------
wglb
The article does not properly distinguish between C as a language and what the
C compiler does with the C program. The logic of the article references what
the compiler does.

The reasonable way to measure languages is to look at the abstractions present
in the language. C has fewer abstractions than the other languages that we are
familiar with. That is the reasonable definition of the level of a language.

~~~
favorited
That's exactly the author's point. The C that programmers write is remarkably
far from what the compiler generates for modern hardware.

How do you propose measuring the number of abstractions? JavaScript has
remarkably few built-in abstractions, but it's in no way "low-level" from a
hardware perspective.

------
z3t4
I wonder if it's easier for a compiler/CPU to optimize "async" code? I often
find myself with an array in JavaScript and calling the same function on each
item in the array; it would be nice if such cases were made parallel, which I
think is possible in C++. Is that ever going to happen in JavaScript!?
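
For reference, in C/C++ the usual spelling of that parallel "same function on
each item" pattern is an OpenMP pragma; a minimal sketch (compile with
-fopenmp):

    #include <math.h>

    /* The OpenMP runtime splits the iterations across cores; each
       element is independent, so the loop body needs no locking. */
    void map_sqrt(double *a, int n) {
        #pragma omp parallel for
        for (int i = 0; i < n; i++)
            a[i] = sqrt(a[i]);
    }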

------
Shikadi
Language evolves. C is certainly lower-level than C# or JavaScript, so even
if it no longer fits the definition created decades ago, I don't see a
problem with the term evolving to match modern times. People say "assembly
language" when they mean assembly language (which others have argued isn't
low-level anymore anyway), so using "low level" to describe a language closer
to the hardware seems valid to me. It's interesting that the author argues C
could be considered low-level on the PDP-11, because by the old definition
used back then it definitely wouldn't have been. That tells me the author's
definition of low-level is already an evolution of the original, so there's
no reason the term can't evolve some more.

Wiki definition:

"A low-level programming language is a programming language that provides
little or no abstraction from a computer's instruction set
architecture—commands or functions in the language map closely to processor
instructions. Generally this refers to either machine code or assembly
language."

~~~
lmm
The whole point of the article is that by a definition like the one you
quoted, modern C is not low-level, though PDP-11 era C was.

~~~
richardwhiuk
That's not true - PDP-11 era C isn't either - if you run it on a modern
processor. And it's doubtful it even was then.

~~~
lmm
> PDP-11 era C isn't either - if you run it on a modern processor. And it's
> doubtful it even was then.

With a modern compiler C isn't low-level. But under PDP-11 era C compilers, C
really did map closely to processor instructions.
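
The example usually cited for that claim is C's post-increment pointer idiom
lining up with the PDP-11's autoincrement addressing mode; a sketch of the
folklore version (the link in the reply below argues this influence is a
myth):

    /* Folklore pairing: each iteration of this loop supposedly
       compiled, on an early PDP-11 compiler, to roughly
       MOVB (R0)+,(R1)+ / BNE loop, since MOVB sets the condition
       codes from the byte it moves. */
    char *copy_string(char *dst, const char *src) {
        char *d = dst;
        while ((*d++ = *src++) != '\0')
            ;
        return dst;
    }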

~~~
Shikadi
Not closely enough that it would have been considered low-level at the time;
if you look at the instruction set, C isn't even close to one-to-one with the
machine instructions. See
[https://en.m.wikipedia.org/wiki/PDP-11_architecture#Myth_of_...](https://en.m.wikipedia.org/wiki/PDP-11_architecture#Myth_of_PDP-11_influence_on_programming_languages)
for why people might mistakenly think otherwise.

------
hokus
[http://web.archive.org/web/20180502001551/https://queue.acm....](http://web.archive.org/web/20180502001551/https://queue.acm.org/detail.cfm?id=3212479)

------
qsdf38100
"processors wishing to keep their execution units busy running C code" What?
This is non sense, the processor is not running C code! The processor can only
run machine code, regardless of the language used to write the source code.

~~~
monocasa
Eh, the past thirty years have given us CPUs designed to run C. That's the
whole point of RISC: the idea of "let's just pare the CPU down to what
actually gets compiled; now we have fewer gates in the critical path and we
can run our chips faster".

~~~
qsdf38100
Hmmm, I think that CPU instruction sets inherit mainly from the 8086, and
[https://en.wikipedia.org/wiki/Intel_8086](https://en.wikipedia.org/wiki/Intel_8086)
mentions a few languages that had influence on the 8086 design, but doesn't
mention C at all.

------
smadge
Makes me wonder if x86 could be extended to expose the underlying
parallelism. How much faster would my Prolog and Haskell programs run if all
branches were executed simultaneously and only the successful path down my
search tree returned?

~~~
tome
Probably not much faster, otherwise you would have just implemented that by
hand.

------
sriku
I couldn't read the article, but based on the comments: would it change the
way we use C whether or not we declared it a "low level language"?

~~~
Sean1708
The article is actually about how closely C maps to what is actually run on
the hardware and whether hardware would look significantly different today if
people didn't expect C to map closely.

------
justicezyx
A statement that uses an adjective is almost always about defining the
context.

------
burke
C is Not a True Scotsman

~~~
mar77i
Damn Scots! They ruined Scotland!

------
waynecochran
Ok, w/o dipping into machine code, show me a low-level language. Any snippet
of C code is transparent in that you know roughly how it is going to be
translated into machine code.

~~~
umanwizard
> Any snippet of C-code is transparent in that you know roughly how it is
> going to be translated into machine code.

This isn't really true with a modern optimizing compiler.

~~~
phamilton
Precisely. As someone who's tried to duel a compiler for performance, I can
say that -O3 output bears very little resemblance to anything I've ever
written, and it significantly outperforms what I've written.

------
sigjuice
Where does it say in the ISO C standard that C must be translated to assembly
code or machine code of any sort?

EDIT: Various C interpreters exist

------
11thEarlOfMar
>403 Error - Access Forbidden We are sorry ... ... but we have temporarily
restricted your access to the Digital Library. Your activity appears to be
coming from some type of automated process. To ensure the availability of the
Digital Library we can not allow these types of requests to continue. The
restriction will be removed automatically once this activity stops.

We apologize for this inconvenience.

Please contact us with any questions or concerns regarding this matter:
portal-feedback@hq.acm.org

The ACM Digital Library is published by the Association for Computing
Machinery. Copyright © 2010 ACM, Inc.

~~~
suprfnk
Got that too. Interestingly enough, these automated processes were able to
get through:

[https://web.archive.org/web/20180501183242/https://queue.acm...](https://web.archive.org/web/20180501183242/https://queue.acm.org/detail.cfm?id=3212479)

[https://webcache.googleusercontent.com/search?q=cache:sClfdA...](https://webcache.googleusercontent.com/search?q=cache:sClfdAKpYXcJ:https://queue.acm.org/detail.cfm%3Fid%3D3212479+&cd=1&hl=en&ct=clnk&gl=nl)

------
lowken10
If the general public & tech community refer to C as a low-level language,
then it is a low-level language.

~~~
pjmlp
When a lie gets repeated enough times, eventually it becomes a fact.

~~~
sametmax
It's just practical. Otherwise, what do you call Java? Or Python?

And what would be the benefit of changing those particular semantics?

In French we'd describe such an article as "fucking a fly".

~~~
mr_toad
Java and Python are much closer to C than C is to assembly.

Managing memory and raw pointers are nothing. Try implementing a recursive
function call in x86 assembler to get a real idea of low level.

~~~
vardump
> Try implementing a recursive function call in x86 assembler to get a real
> idea of low level.

Isn't that pretty easy? Recursion is just a function calling itself, directly
or indirectly.

Here's the simplest possible recursive function in x86 assembler. It simply
causes a stack overflow.

      recursion:
        call recursion        ; recurse unconditionally; overflows the stack

That's it.

A tail-recursive version would simply be a jump:

      recursion:
        jmp recursion         ; the tail call becomes a plain loop; no stack growth

Here's the simplest terminating version I could think of (it assumes eax
holds a positive count on entry):

      recursion:
        dec eax               ; eax doubles as parameter and loop counter
        jz rec_exit           ; counter reached zero: start unwinding
        call recursion        ; otherwise recurse one level deeper
      rec_exit:
        ret

~~~
mr_toad
A call is not a function; functions have parameters and return values, and
you’ll need at least one local variable. And you need to save the values of
all the registers if you don’t want the called function overwriting them.

You'll need to manually push all of that onto the stack with each call.
There's no compiler to do all that work for you.

~~~
vardump
If I have just one parameter in eax/rax and the return value in eax/rax, why
would I need a stack frame? It's not like any registers except eax/rax (and
flags) are modified anyway. And that function doesn't call anything else that
could modify eax/rax.

A call is a function as long as caller and callee agree on the calling
convention. Non-exported functions don't necessarily need to adhere to
platform ABI.

Generally you only push registers onto the stack when you need to modify more
variables than fit in your calling convention's "thrashable" register set,
when you save register contents in order to call another function, or to push
function-call parameters onto the stack.

I do embedded systems & low level drivers as my dayjob. Not a stranger to
writing assembler routines.

------
arseraptor
Ah yes, David Chisnall. Another Cambridge wannabe without a hope of tenure
track who thinks he is cleverer than he really is and makes a bunch of trite
points over and over hoping to get some attention -- not realising they've
been made for over 20 years. Have an original thought David, and stop feeling
smug. You're not.

~~~
dang
We've banned this account for repeatedly violating the site guidelines.

[https://news.ycombinator.com/newsguidelines.html](https://news.ycombinator.com/newsguidelines.html)

------
julienfr112
Ok. But what are the alternatives that aren't a decade away?

------
_pmf_
It's low level, but the level is not identical to the machine level.

------
retrogradeorbit
It's all relative. Lower-level than what? Higher-level than what? C is
lower-level than a huge number of other languages, so I would feel
comfortable calling it "low level".

~~~
Avshalom
Relative to dozens of years of "portable assembly" and "C makes you
understand how a computer works" and "C is efficient because it maps almost
1:1 to CPU operations" and a jillion related claims.

~~~
jacquesm
Depending on your hardware that is still the case. There are plenty of
embedded systems where these claims still hold. It's not really C that has
changed (though the language has evolved a little bit), it's the hardware that
changed and the implementation of the language.

~~~
cestith
That's the main thrust I got from the article. It does depend on the hardware,
and for mainstream desktop and server hardware it no longer maps well to what
the machine is doing.

------
emilfihlman
This article is pure clickbait.

C is low-level. For example, on AVRs everything you do maps very clearly to
what happens as opcodes.

It's as if the author wants to blame C for whatever reason and conveniently
forgets that C is also portable.

~~~
cestith
The author isn't blaming C. C has stayed largely the same. The author is
saying that Intel and AMD have - unlike PIC, AVR, and such - hidden the
machine from C so thoroughly that it's no longer a low-level language for that
platform.

------
laythea
The title is not a well-formed statement. It all depends on what you are used
to. I.e., if I write Java, C is low-level; if I write assembler, C is
high-level.

------
skylyrac
This doesn't make any sense. It would mean that my C code compiled for a
Cortex-M0 is low-level, but the same code compiled for my x86 laptop is not.
Or, even more absurd, that the same assembly code running on an old 386 is
low-level, but on an i7 isn't.

Low-level is about how close you are to talking to the CPU, not about how
close you are to the silicon. The CPU is a black box and the programmer
communicates with it; what that box does inside doesn't matter.

~~~
sigjuice
Also, various C interpreters exist where there is no explicit C -> assembly
translation.

------
mar77i
As far as I understand C, most of what are here called its "quirks" have
actually been enablers for much of the portability and performance of modern
platforms. So I don't like seeing "undefined behavior" and the like
criticised as such a "hindrance". I therefore doubt the author's familiarity
with C goes much beyond the basics, which is perhaps why the author also had
to namedrop Spectre and Meltdown -- which were caused by later optimizations
being unsound, i.e. the Tomasulo algorithm.

The problems with the article somewhat remind me of the problems with LCTHW,
whose author admitted being unable to figure out what the fuss about K&R was:
[https://zedshaw.com/2015/01/04/admitting-defeat-on-kr-in-lcthw/](https://zedshaw.com/2015/01/04/admitting-defeat-on-kr-in-lcthw/)
Sorry to re-repost this article again; I just perceive two variants of the
same "smells" in both.

~~~
dikaiosune
from the article:

> David Chisnall is a researcher at the University of Cambridge, where he
> works on programming language design and implementation. He spent several
> years consulting in between finishing his Ph.D. and arriving at Cambridge,
> during which time he also wrote books on Xen and the Objective-C and Go
> programming languages, as well as numerous articles. He also contributes to
> the LLVM, Clang, FreeBSD, GNUstep, and Étoilé open-source projects, and he
> dances the Argentine tango.

If tango experience isn't enough to make his opinion credible, I imagine
being an LLVM and Clang contributor is a pretty good qualification.

~~~
mar77i
I don't understand how someone who ended up working on a C compiler can feel
intimidated by the standard(s) such software needs to adhere to. And if such
people work on C compilers, aren't we logically in for a ride of WTFs?

