
To become a good C programmer (2011) - 6581
http://fabiensanglard.net/c/
======
fabiensanglard
I feel like I wrote that a decade ago. Wait. Damn. It WAS a decade ago.

Amazingly the list is still relevant. I will try to check out the new
suggestions on this thread.

~~~
wmwragg
I've found "Understanding and Using C Pointers" by Richard Reese, a really
wonderful book.

~~~
ExtremisAndy
So glad to see this little gem mentioned! I have many books on C, but this is
the only one I've ever read cover-to-cover. Clear and to the point, and no
fluff/silliness. One of my favorite programming books ever!

------
ternaryoperator
In the 90s, most serious C programmers knew about these books. They were the
core books most devs relied on and referred to. They're all good.

I disagree with this comment: "No website is as good as a good book. And no
good book is as good as a disassembly output."

There are many websites today that are excellent. And there are many websites
that cover obscure topics not found in books.

Finally, I think it's important to move past C89 to at least C99. I realize
that in 2011 (the date of this post), that was less feasible. But today, there
is little reason for most projects not to use the advances found in C99.

~~~
imron
> But today, there is little reason for most projects not to use the advances
> found in C99.

Does visual studio support c99 yet? Last I heard MS wasn’t interested in
supporting it.

~~~
hazeii
Really? No longer an MS user but occasionally my code gets ported to Windows
so would like to know more.

~~~
dfox
MSVC in C mode is essentially C89 plus extensions needed to compile
C++/C99-isms present in Windows SDK and DDK headers. And Microsoft seems to be
mostly interested in whether the compiler is able to build NT kernel and
drivers and nothing much else.

~~~
nwellnhof
That hasn't been true for quite some time. Visual Studio 2013 added many C99
features that aren't in C++, like compound literals or designated initializers.

------
fao_
It's unfortunate he doesn't list the POSIX system interface, which has
personally become an absolute boon:

[https://pubs.opengroup.org/onlinepubs/9699919799/](https://pubs.opengroup.org/onlinepubs/9699919799/)

You can search it using duckduckgo with !posix, too.

Something else I've found infinitely useful when digging into musl libc or
doing assembly programming, is the SYSV x64 ABI:
[https://refspecs.linuxfoundation.org/elf/x86_64-SysV-psABI.pdf](https://refspecs.linuxfoundation.org/elf/x86_64-SysV-psABI.pdf)

~~~
chandlore
Thanks for the link to the POSIX docs, looks very useful. More recent specs
for the x64 ABI can be found at
[https://github.com/hjl-tools/x86-psABI/wiki/X86-psABI](https://github.com/hjl-tools/x86-psABI/wiki/X86-psABI)

------
wwweston
One of the things I tend to think about these days is return on investment. I
spent several years at the beginning of my career being a bad and then
mediocre C programmer, and once I found a few other languages, I got the sense
that being a mediocre-to-good programmer in these languages would be _much_
easier, and that seems to have been borne out.

Late into a career investing in other areas, what's the advantage of becoming
a good C programmer? Especially in a time where Rust and Go are viable
options?

~~~
jason0597
Rust isn't viable for embedded platforms, at least not yet. It's not as easy
to compile it to the most obscure ISAs as C, and the little support it has for
stuff like STM32 is restricted to just that group of microcontrollers and
doesn't support the entire ARM range. Maybe in the future? I'm looking forward
to that day!

Go is probably never going to run on microcontrollers due to its very big
overhead.

C maps almost directly onto the hardware of a processor. Every single design
decision about C was made with the computer in mind. Memory, pointers, the
stack, the call stack, returns, arguments: everything is so excellently
designed.

Even Linus Torvalds says so!
[https://www.youtube.com/watch?v=CYvJPra7Ebk](https://www.youtube.com/watch?v=CYvJPra7Ebk)

If I were to make one change to C, it would be to completely rip out the
#include system and bring a proper modules system. Apart from that, it's
pretty much perfect.

~~~
zenhack
> Every single design decision about C was made with the computer in mind.

The only problem is that computers have changed a bit in the last 50 years,
and C largely hasn't. There are a couple issues:

First, C was designed for single-pass compilers, because the PDP-7 it was
designed for was too small to actually run much fancier of a compiler. So C is
seriously sub-optimal for optimization in a lot of ways (because the
assumption was you weren't going to do compiler optimizations anyway), and
there are some user-visible warts like forward declarations that are
completely unnecessary today.

Second, the relevant questions with regard to CPU performance have changed a
lot. Most notably:

\- CPU performance has completely outstripped memory perf, so memory
hierarchies and locality are _everything_

\- Parallelism everywhere. Multiple cores, but also deeper instruction
pipelines and other such things.

The way those things map to C is completely implicit; they don't show up in
the language at all, and getting the machine to do what you want requires
knowing things that the code wouldn't suggest at all.

I think if the same people had designed a language for a similar niche on
today's hardware, a lot of things would be different.

~~~
nomel
> CPU performance has completely outstripped memory perf, so memory
> hierarchies and locality are everything

> they don't show up in the language at all... requires knowing things that
> the code wouldn't suggest at all

I'm utterly confused at this.

It's trivial to lay out memory as you please, where you please, very directly,
in C. Set a pointer to an address and write to it. Better yet, I can define a
packed struct that maps to a peripheral, point it at its memory address from a
data sheet, and have a nice human-readable way of controlling it:
MyPIECDevice.sample_rate = 2000.

Keeping things physically close in memory has always been a strong
requirement, for as long as caches, pages, and larger-than-one-byte memory
buses have existed.

~~~
zenhack
> Set a pointer to and address and write to it. Better yet, I can define a
> packed struct that maps to a peripheral, point it to its memory address from
> a data sheet, and have a nice human readable way of controlling it:
> MyPIECDevice.sample_rate = 2000.

Just make sure you don't forget `volatile` in the right places. A lot of
codebases end up just using their own wrappers written in asm for this kind of
thing, because the developers have learned (rightly or wrongly) not to trust
the compiler.

To be clear, it's not _that_ hard to get the memory layout semantics you want
in C. But issues around concurrent access, when it is acceptable for the
compiler to omit loads & stores, whether an assignment is guaranteed to be a
single load/store or possibly be split up (affects both semantics in the case
of mmio and also atomicity), are all subtle questions, the answers to which
are not at all suggested by the form of the code; the language is very much
designed with the assumptions that (1) memory is just storage, so it's not
important to be super precise on how reads and writes actually get done (in
fairness, the lack of optimization in the original compilers probably made
this more straightforward), and (2) concurrent access isn't really that
important (the standard was completely silent on the issue of concurrency
until C11). If you care about these issues there's a lot of rules lawyering
you have to do to be sure your code isn't going to break if the compiler is
cleverer than you are. A modern take on C should be much more explicit about
semantically meaningful memory access.

I think you can make a sensible argument that wrt _hierarchies_ C is at least
not a heck of a lot worse than the instruction set, so maybe I'm conceding
that point -- though the instruction set hides a lot that's going on
implicitly too. Some of this though I think is the ISA "coddling" C and C
programs; in a legacy-free world it might make more sense to have an ISA let
the programmer deal with issues around cache coherence. I could imagine some
smartly designed system software using the cache in ways that can't be done
right now (example: a copying garbage collector with thread-local nurseries
that are (1) small enough to fit in cache (2) never evicted and (3) never
synced to main memory, because they're thread-local anyway). Experimental ISA
design is well outside my area of competency though, so it's possible I'm
talking out of my ass. But the general sentiment that modern ISAs hide a lot
from the systems programmer and that other directions might make sense is
something that I've heard more knowledgeable people suggest as well.

~~~
madmax96
>If you care about these issues there's a lot of rules lawyering you have to
do to be sure your code isn't going to break if the compiler is cleverer than
you are.

>A modern take on C should be much more explicit about semantically meaningful
memory access.

If you are working on concurrent code close to the hardware, you're going to
have to either accept a less efficient language or engage in rules lawyering.
Unfortunately, granting the compiler license to perform even the most mundane
optimizations interferes with concurrent structures. Fortunately, with C there
are rules to lawyer with, and they are actually simple. No matter what, the
rules will always need to be learned.

------
aphextron
Why does everyone always suggest K&R? I really don't see the big deal. It's
short, contains only trivial toy examples with no real world application, and
touches almost nothing on actual project architecture or best practices. I
bought it expecting to learn to write programs in C but it's really just a
reference manual on syntax.

~~~
aikah
> Why does everyone always suggest K&R?

Because it's the best no-bullshit book to learn C, period. Especially coming
from garbage-collected languages. It's not going to teach you valgrind or GDB,
or even how to compile and link C programs, but it will certainly teach you
how to write C programs.

> only trivial toy examples with no real world application

Re-implementing POSIX commands is writing real world applications.

> and touches almost nothing on actual project architecture or best practices.

On what platform? Linux? Windows? Mac? Toolchains are so different on all
these platforms that it wouldn't make sense to spend time on that. That book
is about C, not learning autotools and other horrors.

------
yesenadam
This is a great list of books - at least, I found the same ones were the most
excellent. Also I really learnt a lot from _21st Century C_. (The author here
said they wanted to stick to C89, which is fair enough.)

>To read great C code will help tremendously.

But then he just gives examples of games I don't know and am not really
interested in. Anyone know some non-game C that is, I suppose, well-known and
wonderfully-written?

Also, am learning about writing APIs, any good books or talks about that
people could recommend, or examples of particularly good ones in C? (I'm
particularly interested in maths/graphics-type stuff.) Thanks!

~~~
eindiran
The sqlite codebase is well-known for being a large, well-written C codebase:

[https://github.com/sqlite/sqlite](https://github.com/sqlite/sqlite)

The tests in particular are very impressive.

Some other notable C codebases: Redis, LuaJIT, FreeBSD, Memcached -->

[https://github.com/antirez/redis](https://github.com/antirez/redis)

[https://github.com/LuaDist/luajit](https://github.com/LuaDist/luajit)

[https://github.com/freebsd/freebsd](https://github.com/freebsd/freebsd)

[https://github.com/memcached/memcached](https://github.com/memcached/memcached)

~~~
jacquesm
Varnish and the Linux kernel should be in that list as well I think.

~~~
eindiran
Nice! Here are the links:

Varnish --> [https://github.com/varnishcache/varnish-cache](https://github.com/varnishcache/varnish-cache)

Linux Kernel -->
[https://github.com/torvalds/linux](https://github.com/torvalds/linux)

------
anaphor
"C Interfaces and Implementations" is another one that comes highly
recommended.

[https://sites.google.com/site/cinterfacesimplementations/](https://sites.google.com/site/cinterfacesimplementations/)

~~~
sn9
This book is fantastic to tackle for anyone comfortable with C at the level of
K&R. A great second book on C.

And one of the few examples of literate programming one might come across!

------
AceJohnny2
The K&R book was for decades my gold standard of a programming book. Perfectly
clear and concise, its one notable failing being that it's obsolete.

The Rust Programming Language [1] has since equalled and surpassed it in my
mind. Hats off for Steve Klabnik & Carol Nichols :)

(note, I'm not saying Rust is a better answer than Fabien's recommendation,
just that _its book_ is as high quality as the excellent K&R)

[1] [https://doc.rust-lang.org/book/](https://doc.rust-lang.org/book/)

------
markus_zhang
I think for us laymen, the most difficult thing is to find a good reason to
use C frequently. I have a few "legitimate" (in the sense that C is actually
suitable for those) non-embedded projects that I'd like to take on in the
future when I have time to pick up C:

1 - Recreate some of the Linux command line tools, including a command line
interface similar to BASH;

2 - Implement a fully functional compiler or byte-code Virtual Machine for a
stripped down scripting language;

3 - Write a key-value data store

Sadly I really can't name any project that is not system programming. I saw
some projects that use C for backend (CGI style?) programming or game
programming but I believe there are more suitable tools for either of them.

BTW I also found it very helpful to use C or a stripped-down version of C++ to
teach myself data structures.

~~~
yters
C + knowing the Linux API and what is done in kernel vs user space can lead to
very efficient Linux software. Since much consumer software these days is a
webapp with a Linux backend, this sort of embedded optimization can make good
consumer sense.

~~~
markus_zhang
It's a pity that it's almost impossible for someone from outside the field to
find a C/C++ job nowadays. Much of the deeper knowledge takes too much time
and effort to master from outside the field, so I guess most people don't go
this way.

------
commandlinefan
> No website is as good as a good book

Good (and often forgotten) advice for more than just C...

~~~
Touche
I don't agree with this sentiment. People learn different ways. I personally
learn through trial and error, and explanations of things that I haven't been
able to "touch" yet don't help me at all.

~~~
WoodenChair
> I don't agree with this sentiment. People learn different ways. I personally
> learn through trial and error, and explanations of things that I haven't
> been able to "touch" yet don't help me at all.

Learning through trial and error can be okay for custom problems that don't
have well-defined solutions. For simple problems, it's highly inefficient.
Extreme example: if I want to print something to the console in C, I can use
trial and error for half an hour to try to understand the nuances of how to
use printf(), or I can spend a few minutes reading a few pages in a book or a
(good) online tutorial. Not only will I learn faster, I will also learn best
practices.

My experience has been that people who are really against learning through
books/tutorials tend to have short attention spans. Which is okay—books don't
work for you. But for people who do have the ability to sit and read, it can
be much more efficient than struggling through trial-and-error.

Somebody has already solved your problems, and they can tell you in a few
pages how you can solve them too.

------
matheusmoreira
> The Standard C Library

> Or how errno came to existence?

This is interesting. According to this book, errno was created because they
wanted system calls to work like ordinary C functions:

> Each implementation of UNIX adopts a simple method for indicating erroneous
> system calls.

> Writing in assembly language, you typically test the carry indicator in the
> condition code.

> That scheme is great for assembly language. It is less great for programs
> you write in C.

> You can write a library of C-callable functions, one for each distinct
> system call.

> You'd like each function return value to be the answer you request when
> making that particular system call.

This turned out to be unnecessary. For example, many Linux system calls return
a signed integer holding either the result or the negated errno constant. When
an error occurs, the C library stub function simply negates that value and
assigns it to errno, which is typically a _thread-local_ variable.

The C standard library provides many examples of bad design. All the hidden
global data, the broken locale handling, the duplication found in so many str*
and mem* functions... Freestanding C lacks most of the standard headers and is
a better language for it. Understanding the historical context that
contributed to these designs is very useful since it allows new languages to
avoid repeating these mistakes.

------
aungmyohtet
>> Expert C Programming. This book is fantastic because it will bring your
attention to what happens under the hood in a very entertaining way.

I am not a C programmer, nor was I trying to become one. I was just trying to
become familiar with low-level programming and computer internals in general.
But I enjoyed reading the book a lot.

One of the stories I remember is about the programming contest at
Carnegie-Mellon University. The rules were simple: the program had to run as
fast as possible, and it had to be written in Pascal or C. Here are some
paragraphs extracted from the book about the result:

"The actual results were very surprising. The fastest program, as reported by
the operating system, took minus three seconds. That's right—the winner ran in
a negative amount of time! The next fastest apparently took just a few
milliseconds, while the entry in third place came in just under the expected
10 seconds. Obviously, some devious programming had taken place, but what and
how? A detailed scrutiny of the winning programs eventually supplied the
answer. The program that apparently ran backwards in time had taken advantage
of the operating system. The programmer knew where the process control block
was stored relative to the base of the stack. He crafted a pointer to access
the process control block and overwrote the "CPU-time-used" field with a very
high value. The operating system wasn't expecting CPU times that large, and it
misinterpreted the very large positive number as a negative number under the
two's complement scheme."

~~~
markus_zhang
This reminds me of The Story of Mel, a Real Programmer. I'll quote:

Mel loved the RPC-4000 because he could optimize his code: that is, locate
instructions on the drum so that just as one finished its job, the next would
be just arriving at the “read head” and available for immediate execution.
There was a program to do that job, an “optimizing assembler”, but Mel refused
to use it.

------
makecheck
Using C as a lowest-common-denominator for game engines makes sense but I do
_love_ having Objective-C (despite the platform limitations). It just makes
expressing game logic so easy, with protocols, etc. and I would not want to do
it all with just C.

------
pmiller2
Interesting comment they made about using C89 for portability. What's the
state of C99 today? I know gcc doesn't support all C99 features, but what
about other compilers? Is it worthwhile to use C99 in a new project?

~~~
quelsolaar
C99 is well supported now, but it has become increasingly clear that some of
the changes to the definition of undefined behavior in the spec have made some
very troubling compiler optimizations possible that can make C a lot less
dependable. Later versions of C also adopt a new memory model that is very
broken. memcopy is, for instance, impossible to implement in C17. Due to these
issues the Linux kernel is no longer using standard C. So as it stands, C89
remains a good choice.

~~~
kccqzy
> Later versions of C also adopt a new memory model that is very broken.

You mean, adopting a memory model that properly supports multi-threaded code?

> memcopy is for instance impossible to implement in C17.

What? I assume you mean memcpy? How is it impossible to implement given that
it's implemented in the standard library?

> Due to these issues the Linux kernel is no longer using standard C.

As others have mentioned, the Linux kernel never used standard C. Just search
for, say, the substring "__builtin" where it uses compiler builtins directly.
It's also quite recent that Linux can be compiled by clang; if it were using
standard C, such compatibility issue would not arise in the first place.

~~~
quelsolaar
The new memory model tried to make it possible to implement fat pointers with
reference counters. That means the compiler must be able to tell when a
pointer is being copied. After the spec was ratified, several people pointed
out that it's not possible to implement memcpy (typo earlier, sorry), since
memcpy doesn't know what it is copying and might therefore copy a pointer
without increasing the ref counter.

------
fredsanford
I consider [0] to be well written and fairly low fat.

0 [https://github.com/Genymobile/scrcpy](https://github.com/Genymobile/scrcpy)

~~~
sandov
I think you pasted the wrong URL.

------
kuharich
Prior discussion:
[https://news.ycombinator.com/item?id=11606296](https://news.ycombinator.com/item?id=11606296)

------
senderista
I would add "C Interfaces and Implementations" to his (excellent) list.

------
stebann
This is a good list because it isn't a gigantic compendium about C.

------
LessDmesg
C, the low-level language that cannot even get integer types right and
believes that -1 is greater than 1... I wish the Zig language people would get
some industrial backers so C can finally start fading away into obscurity.

------
Gibbon1
> I picked C89 instead of C99

Stopped reading right there.

