Ask HN: How do I learn C properly?
432 points by buchanae on March 8, 2020 | 199 comments
I have 15+ years of experience in many languages: javascript, python, go, etc.

I can write C, but I'm not confident I'm doing it the right way. There are now decades of forums and blogs I can search through, but I'd love to have a single, clean, clear source of truth for recommendations on how to write C code.

I've started reading "Modern C" by Jens Gustedt. I'm not sure if this is the closest to a modern source of truth.

Note, I'd like my C code to work on Windows, which possibly restricts the "right way" to features provided by C89. But again, I'm not sure; the internet isn't super clear on this.

Thanks for the tips!




It sounds like you're not asking, "How do I learn the language?" but "How do I know I'm doing it right?"

I think Gustedt's book is superb if you're just trying to learn the language, and it does a good job of presenting the most up-to-date standard. I admire the K&R as much as anyone else, but it's not exactly an up-to-date language resource at this point, and it doesn't really provide any guidance on the design and structure of systems in C (that's not its purpose).

You might have a look at Hanson's C Interfaces and Implementations: Techniques for Creating Reusable Software. That's focused on large projects and APIs, but it will give you a good sense of the "cognitive style" of C.

21st Century C is very opinionated, and it spends a great deal of time talking about tooling, but overall, it's a pretty good orientation to the C ecosystem and more modern idioms and libraries.

I might also put in a plug for Reese's Understanding and Using C Pointers. That sounds a bit narrow, but it's really a book about how you handle -- and think about -- memory in C, and it can be very eye-opening (even for people with a lot of experience with the language).

C forces you to think about things that you seldom have to consider with Javascript, Python, or Go. And yes: it's an "unsafe" language that can set your hair on fire. But there are advantages to it as well. It's screaming-fast, the library ecosystem is absolutely gigantic, there are decades of others' wisdom and experience upon which to draw, it shows no signs of going away any time soon, and you'll have very little trouble keeping up with changes in the language.

It's also a language that you can actually hold in your head at one time, because there's very little sugar going on. It's certainly possible to write extremely obfuscated code, but in practice, I find that I'm only rarely baffled reading someone else's C code. If I am, it's usually because I don't understand the problem domain, not the language.


Do any of these resources help with learning how to deal with:

1. The lack of abstraction in C?

2. All of the pitfalls and gotchas: Undefined Behaviour, use after free, etc?

I've been writing Rust for a couple of years now, so I'm quite comfortable with low-level ideas, memory management, etc. But C still scares me because it seems so easy to make mistakes, and I also often find it hard to see the forest for the trees.

Perhaps C's just not for me?


I think both “Expert C” and “Modern C” do a reasonably good job working through memory management patterns, strategy, and constructs. However, I’m using this comment as an attempt to discuss another topic you raised in your comment.

I don’t mean for this to be flame bait, but I do think it is worth pointing out and discussing if you desire. The concept of memory management that is presented in a majority of the Rust that I have seen written outside of deep library code and the parts of the low-level implementations of some standard library functions has very little relation to the analogous concepts in C. I’d say that what Rust considers memory management in average code is more similar to programming in C without using any heap allocation. I could be wrong or could just have not seen the right code bases, but there is very little C style byte level ‘data structure’ layout design and then the allocation of those sorts of data using multiple allocation strategies.

I certainly understand that the above mentioned constructs and patterns are really not part of Rust’s idiomatic usage, and that a chunk of C’s allocation options are semi-prohibited because of Rust’s memory model. But, if you are coming from Rust and going into C, the differences are far greater than the similarities in this area.
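
To make that contrast concrete, here is a minimal sketch of one allocation strategy that is routine in C but rarely written by hand in everyday Rust: a bump (arena) allocator that carves aligned chunks out of a single caller-supplied buffer. The names `struct arena`, `arena_alloc`, and `arena_reset` are made up for illustration, and the sketch assumes `align` is a power of two and that the buffer itself is suitably aligned (for example, it came from `malloc`).

```c
#include <assert.h>
#include <stddef.h>

/* A bump allocator over one caller-supplied buffer (illustrative names). */
struct arena {
    unsigned char *base; /* start of the backing buffer */
    size_t cap;          /* total bytes available */
    size_t used;         /* bytes handed out so far */
};

/* Returns `size` bytes at an offset that is a multiple of `align`,
 * or NULL if the arena is exhausted.
 * Assumption: `align` is a power of two and `base` is itself aligned. */
void *arena_alloc(struct arena *a, size_t size, size_t align)
{
    size_t start;

    assert(align != 0 && (align & (align - 1)) == 0);
    start = (a->used + (align - 1)) & ~(align - 1); /* round up */
    if (start > a->cap || size > a->cap - start)
        return NULL;
    a->used = start + size;
    return a->base + start;
}

/* Everything is released at once by resetting the offset. */
void arena_reset(struct arena *a)
{
    a->used = 0;
}
```

There is no per-object free here at all: the lifetime of every allocation is tied to the arena, which is exactly the kind of decision C leaves to the programmer.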

I’m certainly not questioning your credentials, experience, or ability, I really just feel like this area is lacking any substantial discussion where said discussion is not focused on the merits/drawbacks of the language design. You clearly know Rust and seem to be open to learning or working with C, so it isn’t about Rust v. C, it’s about compare and contrast, and highlighting the fundamental differences that could cause some mental roadblocks.

P.S. Sorry for rambling, Daylight Saving Time is kicking my ass.


Not parent but Rust definitely makes you think about a lot of important aspects of memory management like lifetimes and references and ownership and stuff. In fact, it makes it impossible not to think about them: they're enforced by the compiler.

Those mechanics are pretty important but Rust definitely lets you program with values on the heap as well, it just does it in a strongly-typed fashion. I will agree that smart pointers are generally easier to work with, particularly with things like deref coercion, but Rust is philosophically closest to a language like C++ but with a stronger type system, memory safety, and lots of footguns removed.

I think Rust gives someone a better basis for learning and writing C than probably any other modern language.


I grokked C only after dabbling with Forth. C is too high level to know what's going on at the hardware level.


Most processors these days are basically running C virtual machines in hardware anyways.


> I could be wrong or could just have not seen the right code bases, but there is very little C style byte level ‘data structure’ layout design and then the allocation of those sorts of data using multiple allocation strategies.

My experience is mostly in very high level languages, C#, TypeScript, SQL, so that's probably reflected in the style of Rust code that I've been writing, but at least as one data point, yes, it is very possible to write day-to-day Rust code without getting into byte level data layout.

It does feel a lot like writing C code without using a heap. In fact the only case I've dealt with allocating uninitialised heap memory so far has been when interacting with a C API. Creating a new Vec does allocate memory on the heap (once values are pushed into it), but as a user it feels like passing around any other stack allocated struct. I think this is mostly due to the RAII-style Drop and ownership system, as there is no extra code to free a value that owns a heap allocation vs a stack-only value.

The implementation of std::mem::drop is a fun example of this: https://doc.rust-lang.org/std/mem/fn.drop.html
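
For contrast, a rough C sketch of a growable array shows the bookkeeping that `Vec` and `Drop` take care of in Rust: nothing is allocated until the first push, and the owner has to remember to call the free function exactly once. `struct int_vec`, `vec_push`, and `vec_free` are hypothetical names, not taken from any library, and the sketch ignores size overflow when doubling the capacity.

```c
#include <stdlib.h>

struct int_vec {
    int *data;  /* heap buffer, NULL until the first push */
    size_t len; /* elements in use */
    size_t cap; /* elements allocated */
};

/* Appends value; returns 0 on success, -1 if allocation fails
 * (the vector is left unchanged in that case). */
int vec_push(struct int_vec *v, int value)
{
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 4;
        int *p = realloc(v->data, new_cap * sizeof *p);

        if (p == NULL)
            return -1;
        v->data = p;
        v->cap = new_cap;
    }
    v->data[v->len++] = value;
    return 0;
}

/* The manual counterpart of Drop: release the buffer and reset the
 * fields so that an accidental second call is harmless. */
void vec_free(struct int_vec *v)
{
    free(v->data);
    v->data = NULL;
    v->len = 0;
    v->cap = 0;
}
```

Forgetting `vec_free` leaks, and using `data` after it is a use-after-free; neither is caught by the compiler.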


Yes. Or, at least, the second part.

I'm not sure what you mean by "lack of abstraction in C."

"Abstraction" is not a straightforward property of a language; it's a thing you create with programming languages. It's true that some languages make it easy to express certain kinds of abstractions, but C has its own way of abstracting things, and it's certainly not absolutely obvious that its way of doing so is inferior to others.


Sometimes making mistakes is a good way to learn, provided that you get feedback to help you identify them.

You can compensate for C's lack of compile time protections with run-time tools like valgrind [1] and the various sanitizers in clang [2, 3].

1: http://valgrind.org/

2: https://clang.llvm.org/docs/AddressSanitizer.html

3: https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html
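
As a rough illustration of the kind of mistake these tools catch, here is a small, correct function with the classic off-by-one described in a comment. The function name and the file name in the build line are made up; `-fsanitize=address,undefined` is accepted by both clang and gcc.

```c
#include <stdlib.h>
#include <string.h>

/* Build with, for example:
 *     clang -g -fsanitize=address,undefined example.c
 * or run an uninstrumented build under valgrind:
 *     valgrind ./a.out
 */

/* Returns a heap copy of s that the caller must free, or NULL on failure. */
char *dup_string(const char *s)
{
    size_t len = strlen(s);
    /* Writing malloc(len) here may appear to work in casual testing;
     * the copy below would then write one byte past the allocation,
     * which is what AddressSanitizer and valgrind are designed to report. */
    char *copy = malloc(len + 1);

    if (copy == NULL)
        return NULL;
    memcpy(copy, s, len + 1); /* includes the terminating '\0' */
    return copy;
}
```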


There is an old but great book called "C Traps and Pitfalls" on that subject.

Otherwise you're left alone with yourself and your tooling. Things like valgrind help a lot. But I'll admit that I too find large C projects alarming; it may be better to write "C with classes" code in a C++ environment, so at least you have classes and namespaces as an organisational tool.


I recommend Bradley's 'Programming for Engineers: A Foundational Approach to C and Matlab'. Simple book with relevant exercises.


If a book on C is described as "simple" I bet it's not about doing things right (which is what OP is worried about.) Doubly so if it covers something in addition to C.


Do you think that van der Linden's Expert C Programming is still worthwhile?

There's also the comp.lang.c FAQ http://c-faq.com/ , which is also very old now but still seems like a no-brainer to me.


I think Expert C Programming is a great book (really, as much a classic as the K&R), but I'm not sure I'd put it on the OP's list. At least not right now.

It says something about C (I'm not sure what) that it's possible to learn quite a bit about C from a book that was written in 1994. But it was written before C99, and it shows.

If you're a C hacker and you haven't read it, definitely do. It's great.


I second every book recommendation here. Another great one is The Standard C Library.



> the library ecosystem is absolutely gigantic

I can call out to C libraries from other languages, so that's not a big deal.

In addition, I think that probably the JVM ecosystem is larger and that Python may be as well.

You also find certain things at the bottom (like glibc, for example) where the C ecosystem isn't as diverse as you think it is.


For Linux userland, there are also musl and uClibc-ng. Other platforms have their own libcs.


Another great way to learn about C pointers is to reverse engineer games; live memory editing and scanning in particular require a lot of global/heap pointer detective work.


The proper way to learn C is to get a good book, read it, and do exercises along the way.

Here’s an excellent book: C Programming: A Modern Approach, by King.

Avoid online tutorials, and know that there are many poor books on C, including “Learn C the Hard Way”, which is recommended here. It has factual problems; see under “Stuff that should be avoided” here: http://iso-9899.info/wiki/Main_Page

Note also, that unlike some other languages, C is not a language that you should learn by “trying things out”, due to the nature of “undefined behavior”.

Another recommendation would be to ask for code review (CR) on IRC.

Good luck!


Since UB can be a bad beast to tackle while learning C and C++, I would suggest frequently comparing what you learn with cppreference.com, and checking that what you think you understand matches what the standard dictates. Cppreference is not the standard, of course, but it is sufficiently similar and much easier to read.

If at some point you want to really become a master, then switch to the actual standard.


> Note also, that unlike some other languages, C is not a language that you should learn by “trying things out”, due to the nature of “undefined behavior”.

I disagree. You have to learn the ways in which things fail in any language you learn, especially a language like C.


The point is that you don't learn all the ways in which things fail in C by trying. At best, you learn the way in which they fail (or fail to fail!) on your specific platform/compiler/version/compiler options combo.

If you want to know how things can fail (or, what things can fail in unspecified ways), you have to do a bit of reading and no amount of trying and experimenting will conjure that knowledge.
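
A small illustration of why experimenting is not enough: signed overflow is undefined, so an after-the-fact check can appear to work in one build and be optimized away in another. The sketch below (the name `checked_add` is made up) tests before adding, which is defined for all inputs.

```c
#include <limits.h>
#include <stdbool.h>

/* Tempting but wrong: if a + b overflows, the behaviour is undefined,
 * so a compiler may assume it never happens and delete a check like
 *
 *     if (b > 0 && a + b < a) { ...handle overflow... }
 */

/* Stores a + b in *out and returns true, or returns false (leaving
 * *out untouched) if the sum would not fit in an int. */
bool checked_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *out = a + b;
    return true;
}
```

Whether the naive version "works" on a given compiler and optimization level tells you nothing about the next one; the rewritten check does not depend on that.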


There isn't a guide to learn how things fail on platform, compiler, version, options combo. Probably the only thing that forces you to systematically research and accumulate such knowledge is writing a language VM in C or using C as an intermediate representation. Even writing a C compiler itself won't teach you that.


> There isn't a guide to learn how things fail on platform, compiler, version, options combo.

Yeah, that's the point. Chasing that knowledge is pointless. Instead, read the spec and find out what is legal and what is not. What is defined and what is not. Then you can stop worrying about how things fail and start worrying about not doing those things.


Trying c out with a known solid testing framework seems like a much more practical and useful way to start. Telling people to read a text wall before doing any practice at any task is a great setup for failure. Math doesn’t do that. Physics doesn’t do that. Chemistry doesn’t do that. Literature doesn’t do that. Philosophy doesn’t do that. Etc. and of those philosophy is almost the study of text walls. But the best philosophy courses tend to start with a Socratic dialogue which is basically live action philosophizing.


Well, we also have lots of historic evidence (nearly every application written in C ever) that traditional/intuitive teaching methods don't work for C.

All people have learned from the last 40 years of C is how to write ridiculously insecure applications.

Reading the standard may not be the optimal choice, or even the first choice, but common teaching methods are nearly guaranteed to produce crap results in the specific case of C programming.


I broadly agree. More so than any other language, you cannot produce a good C (or C++) programmer by try-it-and-see alone. It's vital to have a grasp of the way the language is defined.

Point of disagreement: no amount of C expertise makes a hand-written C codebase safe. Vulnerabilities and undefined behaviour are often found even in code written by top-flight C programmers.


To me it's really both. Do one first, notice it works sometimes and you're not sure whether you should publish it (at that point, don't :) -- ) and then look at your own product through the lens of what you're learning. E.g. there are different kinds of UB and different problems it causes. There's a lot of entertaining literature and people in certain communities that can get you on the right track.

One of them was certainly this one: https://devblogs.microsoft.com/oldnewthing/20140627-00/?p=63...


Yeow, the format conversion has completely eaten the formatting on this poor blog.

This is readable: http://web.archive.org/web/20140629013829/http://blogs.msdn....


Oh wow, sorry for not looking into this properly. Thanks for the fixed link.


I’m always skeptical of telling people to not just try something out in programming. Yes, don’t just try to code up the next unix without knowing any c. But definitely getting your hands dirty isn’t going to instill any super bad mental/coding practices.


You can learn by playing around, but it's very helpful to have a guide in the beginning.

I had to start coding for work in C and without some good footing it was frustrating dealing with older code and creating new things with my very limited experience. (I had c++ which isn't the same.)

Spending a week going through a book and getting up to speed helped greatly: it increased my enjoyment and reduced my frustration, fwiw.


[flagged]


Ok I’ll bite, what on earth is the connection between Uyghurs and C?


https://googleprojectzero.blogspot.com/2019/08/a-very-deep-d...

This is one of the few places where writing C is objectively reasonable (an OS kernel) - and also one of the places where it's particularly dangerous if you get things wrong.


Presumably the fact that iOS had vulnerabilities that allowed for watering-hole style zero-day attacks on Uyghurs in China.


By that point you should be well beyond just messing around with c.


Should be, yes, but again we have the evidence: the people whom Apple hires to write kernels regularly make these sorts of mistakes. (Let alone the people that Microsoft hires, or the people who write various third-party Android drivers.) So whatever form of "should" that we have right now isn't working.


I work on Clang and the Linux kernel. JavaScript was the first language I feel I've mastered, before C. Of the many C books I have and have read, I only recommend 2:

Head First C by Griffiths and Griffiths

Expert C Programming: Deep C Secrets by van der Linden

The first is an excellent introduction, especially if you treat it as a workbook.

The second is a great intermediate book.

Really advanced stuff comes from seeing things in the wild.


Also, Jens' book is great, and I've learned from reading it, but it is way too long for a beginner.


An interesting way IME to learn C, is to code for an old platform like the Amiga.

All the same best practices apply, but the underlying OS is simpler, and the constraints, like missing floating point, force you to investigate many aspects of C.

You can use Bebbo's m68k cross compiler to compile for the platform: https://github.com/bebbo/amiga-gcc

No single source of truth for C best practices exists, but I can recommend using static analyzers like:

https://splint.org/

and

http://cppcheck.sourceforge.net/

to steer you in the right direction.

Also, compile your code with the following gcc options: -pedantic -std=<c99 or c89> -ggdb3 -O0 -Wall -Wextra -Wformat=2 -Wmissing-include-dirs -Winit-self -Wswitch-default -Wswitch-enum -Wunused-parameter -Wfloat-equal -Wundef -Wshadow -Wlarger-than-1000 -Wunsafe-loop-optimizations -Wbad-function-cast -Wcast-qual -Wcast-align -Wconversion -Wlogical-op -Waggregate-return -Wstrict-prototypes -Wold-style-definition -Wmissing-prototypes -Wmissing-declarations -Wpacked -Wpadded -Wredundant-decls -Wnested-externs -Wunreachable-code -Winline -Winvalid-pch -Wvolatile-register-var -Wstrict-aliasing=2 -Wstrict-overflow=2 -Wtraditional-conversion -Wwrite-strings

Not that they all necessarily always make sense, but they will reveal weak points in your code.
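
To give a feel for what a few of these flags push you towards, here is a short sketch written to satisfy `-Wswitch-enum`, `-Wswitch-default`, and `-Wconversion`. The enum and function names are invented for the example, and it is not claimed to be silent under the entire list above (`-Wmissing-prototypes`, for instance, wants a prior declaration for every non-static function).

```c
#include <limits.h>
#include <stddef.h>

enum color { RED, GREEN, BLUE };

const char *color_name(enum color c)
{
    switch (c) {
    /* -Wswitch-enum: every enumerator gets its own case, so adding a
     * new color later produces a warning here. */
    case RED:   return "red";
    case GREEN: return "green";
    case BLUE:  return "blue";
    /* -Wswitch-default: out-of-range values are handled explicitly. */
    default:    return "unknown";
    }
}

/* -Wconversion flags implicit narrowing such as `int n = strlen(s);`.
 * Making the range check and the cast explicit documents the intent. */
int clamp_to_int(size_t n)
{
    return n > (size_t)INT_MAX ? INT_MAX : (int)n;
}
```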


Given the fact you already know some higher level languages I think the best way to learn C is to go low level.

C is a fantastic language because it is within your reach to go and understand all the small details of how various features are actually implemented. I find it very helpful to be able to understand how a piece of code actually functions on a low level.

Another tip: you are used to having high level features at your disposal. Don't try to duplicate those features. Figure out how actual C projects deal with program structure and flow issues that you are used to solving with high level constructs.
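
One example of such a pattern, as a sketch only: where a higher-level language would use an interface or a base class, many C codebases use a struct of function pointers embedded as the first member of each concrete type. All names below are invented for the illustration.

```c
#include <stddef.h>

struct shape; /* forward declaration so the ops table can refer to it */

/* The "interface": a table of operations shared by all instances of a type. */
struct shape_ops {
    int (*area)(const struct shape *self);
};

struct shape {
    const struct shape_ops *ops;
};

/* Concrete types embed the base as their FIRST member, so a pointer to
 * the base can be converted back to a pointer to the whole object. */
struct rect   { struct shape base; int w, h; };
struct square { struct shape base; int side; };

static int rect_area(const struct shape *self)
{
    const struct rect *r = (const struct rect *)self;
    return r->w * r->h;
}

static int square_area(const struct shape *self)
{
    const struct square *s = (const struct square *)self;
    return s->side * s->side;
}

static const struct shape_ops rect_ops   = { rect_area };
static const struct shape_ops square_ops = { square_area };

struct rect rect_make(int w, int h)
{
    struct rect r = { { &rect_ops }, w, h };
    return r;
}

struct square square_make(int side)
{
    struct square s = { { &square_ops }, side };
    return s;
}

/* Code written against the base type works with any shape. */
int total_area(const struct shape *const *shapes, size_t n)
{
    int total = 0;
    size_t i;

    for (i = 0; i < n; i++)
        total += shapes[i]->ops->area(shapes[i]);
    return total;
}
```

The dispatch is explicit and the compiler will not stop you from pairing the wrong ops table with a type, which is the trade-off for having no hidden machinery.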


CPython is a particularly nice codebase to dig into. Plus, there's a big todo list: https://bugs.python.org


This^! +1

Also do it on a Raspberry Pi; it will let you touch that low level better. Just flashing an LED through the GPIO pins is enough to grasp the basics.


C is a very simple language. I have used just the official old book to create a compiler for it:

https://www.amazon.com/Programming-Language-2nd-Brian-Kernig...

The best thing you can do is finding some open source code that interest you, read it and write it.

For example, I write my own controllers for CNCs, and there are lots of code for controlling stepper motors out there. I also learned OpenGL and sockets this way long time ago.

On the other hand, C is a terrible language for any big project unless you automate it some way.

I mostly use Lisp to write C code automatically. C is too low level for writing any substantial code. Lisp has things like macros (nothing to do with C macros) that can easily make you write code 10x to 50x faster.


> C is too low level for writing any substantial code.

Do people not realize, before claiming this, that projects like Linux and pretty much every car's ECU software (pretty "substantial" in my opinion) are written in C? I'm not saying this software is 100% bug free, but this claim isn't accurate either. You _can_ write substantial C code when you have a good understanding of the tooling around it. Merely learning the syntax isn't enough.


I'm so tired of hearing about what you can't do in C and the "only thing" it's good for. You can't write big systems. It's only for embedded.

It's such a crackup, because the number of developers who haven't got the memo on this is really staggering. How many lines of code is GTK? Or Postgres? Or Nginx? Never mind languages and operating systems.

I suppose Vim (~335,000 lines) or git (in the 200,000 range) is not "large" in comparison to some things, but I suspect that's actually the kind of number people think is "too big for C."

People also seem not to realize that the reason that Python library is so fast, is because it's actually wrapping C code. And that code is very often not doing some frighteningly low-level thing, but just doing ordinary stuff.


As someone who works on a significant C codebase (IOS-XR), my 2 cents:

1. Most of the large userspace C codebases that still exist today didn’t have many alternatives when they were started.

2. Any non-trivial C codebase relies much more on runtime testing than other languages. In other words, with other languages, you can get away with fewer tests while achieving the same level of quality.

Note that you should be striving for comprehensive tests regardless of language; this is just an observation.


The "steel man" version of this argument is that, if you're embarking on a new codebase which you have good reason to expect will grow to be large, have a very good reason to write it in C, or use a more appropriate language.

There are some reasons to do it, but fewer than there were when most of the projects you've name-checked were started.

I'll skip the part where we review what the other options are, this is HN, we've all seen that thread before.


I take your point about the age of some of these projects (though it's not like there weren't any "more appropriate" languages around when most of them started). And I'll also admit to agreeing with the idea that choice of language is nearly always a sociological matter before it's a technical one.

But I'm not really arguing that C is the best choice for large projects. I'm just pointing out the fact that many, many people just completely ignore that advice. They are ignoring that advice today -- right now. With new projects. And they've "seen that thread before" too. It's not just legacy, or lack of awareness, or whatever.

Maybe it comes down to what the "very good" reason actually is. Because given the amount of C hacking going on, I wonder if the "good reasons" we hear about nearly every day on HN aren't the most important ones for the quite vast number of people merrily hacking away in a language that is now decades old.


Knowing how some big software systems lived through more than a decade each, I'd claim that the less they used C++ and the more used C++ compilers to write a C-like code they were better. The worst parts of the project were exactly those that used the "cool" C++ features.

C is much better for really long-lived projects than C++ is. C++ is a maintenance nightmare in comparison.

Linux kernel is also a good and famous big long-lived project which I claim would have never even survived had it been written in C++. Linus seems to have believed the same.


> C++ is a maintenance nightmare in comparison.

c'mon, all the super large projects - adobe software, ableton live, blender, autodesk, 3DS max, maya... all the major game engines (UE, CryEngine, Frostbite, etc), are written in C++, not in C. LLVM & clang are C++, and GCC is all C++ in new code with a fair amount of old code ported from C. All the major web browsers are C++ - and I'd wager most minor too.

If what you said was true we'd see much more massive projects in C and much less in C++ yet here we are.


I think you might be missing the main point of what this person is saying, which is:

"I'd claim that the less they used C++ and the more used C++ compilers to write a C-like code"

And I bet that's true. I doubt there's layers and layers of template metaprogramming in Ableton Live, or that Maya is a study in inheritance. Game engines, certainly, are C++ written as so-called "better C."

"Yet here we are" -- a place where people reinvent C and call it "data-oriented programming." /s


> I doubt there's layers and layers of template metaprogramming in Ableton Live, or that Maya is a study in inheritance

I doubt both of your assertions - public Ableton code uses templates fairly liberally ( https://github.com/Ableton/link/search?q=template&unscoped_q...) - and due to backtrace exposure I know that this is also the case for private implementation things.

For Maya just look at the plug-in API :

https://help.autodesk.com/view/MAYAUL/2017/ENU/?guid=__files...

https://help.autodesk.com/view/MAYAUL/2017/ENU/?guid=__files...

https://help.autodesk.com/view/MAYAUL/2017/ENU/?guid=__cpp_r...

command pattern, factories, etc... it's textbook OOP.

Also, most projects in the wild nowadays are C++11 or later, with actual use of C++11 language features. For example, look at OpenAGE, an AOE2 reimplementation - https://github.com/SFTtech/openage/blob/master/libopenage/jo...

That kind of thing is as far from "C with classes" as is possible.


> most projects in the wild nowadays are C++11 or later

Which means either that their codebase is less than a decade old, or there were serious rewrites involved. Which fits my initial statement: historically, long term (T > 10 years), successful projects should have better used C++ as C-like as possible (or just used C).

I'm not claiming it's not possible to have successful projects using C++ lasting long, just that it would be, long-term (really long term), less overhead to use C for big projects.

Edit: My personal bias: I was involved in two long-lasting projects, where just dealing with boost dependencies used up more maintenance time than it would have been needed for experienced developers to develop the code which didn't depend on boost. But of course, those who used boost didn't have such an experience to make such calls, and in initial development "look how fast we have a feature x" wins.

I personally consider Google's C++ style guide limiting Boost use as one of the best examples of some influence of some experienced developers on the policies.


> or there were serious rewrites involved

why do you assume that ? if someone was using, say, "Modern C++ Design: Generic Programming and Design Patterns Applied" from Alexandrescu (released in 2001!) as a guideline when writing one's codebase, "migration" to C++11 is maybe replacing some custom types by std ones for stuff like mutexes and threads, etc. but all the architecture would stay pretty similar.

> Which fits my initial statement

I really don't see how - all those projects I listed are successful, you seem to be saying that it is in spite of C++ but I don't see any argument towards that. I personally don't know anyone who would willingly go back to writing large-scale projects in C after working in C++.


As the CVE database proves, just because one can do it, doesn't mean one should do it.

Liability can't come soon enough.


That, and I dislike the low-level vs high-level differentiation. It is relative, and more of a spectrum, IMO, and not that it is particularly useful anyway.


> The best thing you can do is finding some open source code that interest you, read it and write it.

Has reading source code in a language that’s unfamiliar to you shown to be of any real benefit to learning? It seems you need at least a little bit of foundational experience with it before your brain can even parse what you’re seeing in a beneficial way.


> Has reading source code in a language that’s unfamiliar to you shown to be of any real benefit to learning?

Absolutely! You start finding idiomatic patterns and "oh, so that's how they do it" kind of things. Find library functions you never knew about but now you do. Find weird things, and look them up in a manual/reference/SO/chatroom. Things that you might find in a book, but a book that covers all the idioms and practice and weird things is gonna be as thick as the bible, and it'll get outdated (it doesn't help that most books are focused on teaching the basics rather than showing off how to architect your application well). Things that you can learn the hard way, but why not look and see and learn?

> It seems you need at least a little bit of foundational experience with it before your brain can even parse what you’re seeing in a beneficial way.

Nah, you just need programming experience (in a similar paradigm) in general. A lot of what you learned from languages before will translate.

Also OP mentioned that "I can write C" and "I've started reading 'Modern C'", so it's not like they're looking for their first Hello World snippet.

OP said they've got 15+ years of programming experience. At that point, picking up a new language is all about learning the vocabulary and idioms, plus the few unique or tricky or quirky things that don't show up in other languages (or that aren't obvious from looking at code). The fastest way to get to that vocabulary is to look at real code.


>Has reading source code in a language that’s unfamiliar to you shown to be of any real benefit to learning?

Yes, I think so. The parent has experience with other languages, like python, javascript and go. C has a very similar paradigm. If it were clojure it would be different.

I find exposure to a new language's real usage the best way to learn it. I went to the UK to learn English, Germany to learn German. It can be painful at first, but it works, especially if you know a little about what it is all about.

Your brain makes sense of everything by itself in a magical way. That is the way you learn your native language.


I'd tend to agree here also. I didn't start reading source code until a couple years in because I couldn't really put it to use. It becomes 10x more valuable when I was deep into my career learning from the big shots.

It _can_ help no doubt but not sure that's the best way to start. Especially with a language like C. It's not so straight forward without some training or CS education.


> It becomes 10x more valuable when I was deep into my career learning from the big shots.

Yeah well OP said they've 15+ years of experience so..

> Especially with a language like C. It's not so straight forward without some training or CS education.

I agree with pritovido, when they say C is a very simple language. Sure, it has grown its quirks and gotchas along with some cruft (the kind of things you wouldn't learn about in a CS curriculum anyway), but at its core it really is very simple.


For me there's almost no wisdom to assimilate when reading source code until I've given a fair shot at using the language and experiencing its obstacles. Then seeing other people overcome them is where almost all of the a-ha moments are.


You use Lisp to write C (on Microcontrollers)? Would appreciate more info on that.


You could consider this. https://ferret-lang.org/


Not the OP, but as I understand it NASA's JPL has been doing this for decades for space missions.


That's a horrific mistake that MIT nitwits used to teach students at the end of 6.001 - how to write opaque, horrible lisp programs in languages such as "C".


Once you know the basics watch this: https://www.youtube.com/watch?v=443UNeGrFoM

;-)


In addition to the other recommendations, make sure you fully understand the memory model of your target architecture, the stack, the heap, the linkage.

Also, understand the undefined behavior, the compiler optimizations and how it can affect your code if you forget about the UB or if you ignore the compiler warnings, and read about the `volatile` keyword.

And a personal tip, please tell the compiler not to use strict aliasing (I think it's stupid) - you will get rid of some UB without much sacrifice.


> Also, understand the undefined behavior, the compiler optimizations and how it can affect your code if you forget about the UB or if you ignore the compiler warnings, and read about the `volatile` keyword.

PSA: volatile generally doesn't do what you want it to; you almost never want it. volatile does not "fix" undefined behavior.

> And a personal tip, please tell the compiler not to use strict aliasing (I think it's stupid) - you will get rid of some UB without much sacrifice.

This prevents the compiler from performing a number of optimizations.
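
To make the aliasing point concrete, here is a minimal sketch. The name `float_bits` is made up for illustration, and it assumes `float` is 32 bits wide, which is common but not guaranteed by the standard. Reading the float through a `uint32_t *` cast breaks the aliasing rules; copying the bytes with `memcpy` is well-defined with or without `-fno-strict-aliasing`, and compilers typically reduce it to a single move.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the raw bit pattern of a float.
 * Assumes sizeof(float) == sizeof(uint32_t). */
uint32_t float_bits(float f)
{
    uint32_t bits;
    assert(sizeof bits == sizeof f);   /* guard the size assumption */

    /* Undefined under strict aliasing: bits = *(uint32_t *)&f; */
    memcpy(&bits, &f, sizeof bits);    /* well-defined alternative */
    return bits;
}
```

Note that `volatile` plays no role here: it does not make the pointer-cast version legal.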


Just to be clear, heap and stack are not part of C.


Then why does it make a difference whether you dereference a pointer to the stack or the heap at any point in your program?

The hardware representation of an int isn't part of C either but it most certainly has an effect on how your program will run.


Integer type representation is part of C.

See §6.2.6.2 in http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf

Unsigned integers are binary, while signed integers are allowed one of three well-defined representations.


> Then why does it make a difference whether you dereference a pointer to the stack or the heap at any point in your program?

Ok, I'll bite. What difference does it make?


Because stack pointers are short lived, while heap pointers can be longer lived.

Say you have a queue of pointers that some thread is chewing on, you can't stick a pointer to a stack variable into that queue and then exit the scope of that stack variable, but you could put a longer lived heap pointer into that queue and exit the scope of the pointer, essentially passing ownership of the pointer to whatever's chewing on the other end of the queue.


Stack pointers can live just as long (for example, define some objects in main(), and they will exist for the duration of the program; hand out pointers like candy). An allocated object can be deallocated whenever.

Dereferencing said pointers isn't any different, no matter how the pointee was allocated:

    void myfun(void *foo) {
        // do things with foo.
        // how it was allocated is all the same to me
    }

As far as C is concerned, stack and heap do not exist. The validity of reference to an object is defined in terms of lifetime, which stems from its storage duration (static/automatic/allocated).

And these are concepts that just define the semantics of the language. For example, an object with automatic storage duration might never hit stack or heap or any other part of RAM; it could live in registers, or be optimized out altogether. It could be in the bss segment. The semantics also do not forbid the implementation from using mmap() or malloc() (or similar) for objects with automatic storage duration.

Using stack for automatic variables and heap for allocated variables is just one implementation technique, which should not give rise to any observable differences in the meaning of a legal program (i.e. one that doesn't go into undefined behavior) as interpreted by the abstract machine, which the standard defines.
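
A small sketch of the three storage durations, with all names invented for illustration. The function `sum3` dereferences its argument the same way regardless of where the object came from; the only thing that differs between the three objects is when their lifetime ends.

```c
#include <assert.h>
#include <stdlib.h>

static int values_static[3] = {1, 2, 3};      /* static: lives for the whole program */

/* Works on any valid pointer; how the pointee was allocated is irrelevant. */
int sum3(const int *p)
{
    assert(p != NULL);
    return p[0] + p[1] + p[2];
}

/* Sums one object of each storage duration; returns -1 if malloc fails. */
int storage_demo(void)
{
    int values_auto[3] = {4, 5, 6};           /* automatic: lifetime ends with this block */
    int *values_alloc = malloc(3 * sizeof *values_alloc);  /* allocated: lives until free() */
    if (values_alloc == NULL)
        return -1;
    values_alloc[0] = 7;
    values_alloc[1] = 8;
    values_alloc[2] = 9;

    int total = sum3(values_static) + sum3(values_auto) + sum3(values_alloc);
    free(values_alloc);                       /* the allocated object's lifetime ends here */
    return total;
}
```

Returning `values_auto` from `storage_demo` would be the bug: the pointer would outlive the object, whatever memory the implementation happened to use for it.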


The C standard does not talk about a heap or the stack; you could implement a "heap" by handing out pointers from the top level stack frame if you wanted to.


Is this important? By far the common case in C is using heap allocation as implemented by your libc or crt. It's pretty essential knowledge.


I guess they're not "exclusive" to C, but they're very important in C in a way they aren't in most languages. So I think op's advice here is good.


Like anything else, set yourself a goal for a project. What kinds? C today is typically used for developing embedded systems, kernel internals, or device drivers.

Depending on how deep you want to go. The most in depth way to understand C is to learn assembly on an ISA first.


If you’re an experienced programmer, with some understanding of C, I recommend: “Expert C Programming: Deep Secrets” by Peter van der Linden.

It’s one of the best books in the world. About anything. Ever.


I was going to post the same, now I don't have to, fabulous book.


> Note, I'd like my C code to work on Windows, which possibly restricts the "right way" to features provided by C89

That might be the case if you plan to use the Microsoft C compiler, as they have publicly said they don't have plans to update their C compiler.

However that is not your only option, since clang and several GCC ports will also work on Windows.


Microsoft has started to implement more of C99 over the years. I think it was 2015 where they implemented mixed declaration and code. I noticed they put stdint and stdbool finally in 2010 or so.

Still pretty incomplete last I checked but portability is better than it was.


Those features are based on ISO C++ requirements, regarding the adoption of underlying C features and respective standard library.


Mixed declaration and code is very old in C++, and Microsoft has always supported it there. It wasn't until vs2015 that they let you do it in a .c file.


Sure, they did let you do it, as long as that C file was compiled as C++.


There is a "C language mode" inside cl.exe (by file extension or by /TC flag) and there's plenty of C that doesn't compile as C++.

eg.:

    char *p = malloc(n);

No explicit casts from void*!

Anyway, around the time they started allowing mixed declarations and code in a .c file, I was noticing on-the-job that this was becoming a disproportionate source of build failures in intended-to-be-portable .c files my colleagues were writing on Unix. It was as if cultural memory of pre-C99 declaration requirements was disappearing specifically around that time. So it makes sense to me that they went ahead and added it. They probably got a lot of complaints.


Are you not able to use their C++ compiler to get modern features? My understanding is that C++ is a superset and that a C++ compiler is also usually a C compiler. Is this view incorrect for modern implementations?


It's not strictly a superset, to the point that there are good reasons to stay with a C compiler other than to simply stay sane. I make a lot of use of designated initializers for laying out static const arrays. And I actually don't like the way that structs in C++ can be referred to both with and without the struct tag. I also don't like how void-pointers are not compatible with other pointers without an explicit cast. Or that plain integers cannot be assigned to enum types. Some of this is just my personal preference (or Stockholm Syndrome), but the fact is that most C code will fail to compile as C++ without some changes.

MSVC is probably pretty close to C99 these days. Designated initializers are C99, for example.
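
For anyone who hasn't used them, a sketch of what designated initializers look like for a static const table. The enum, table and struct names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum color { COLOR_RED, COLOR_GREEN, COLOR_BLUE, COLOR_COUNT };

/* C99 array designators: each entry is tied to its enum value, so
 * reordering the enum cannot silently misalign the table. */
static const char *const color_names[COLOR_COUNT] = {
    [COLOR_BLUE]  = "blue",
    [COLOR_RED]   = "red",
    [COLOR_GREEN] = "green",
};

struct point { int x, y, z; };

/* Members not named in the initializer (here y) are zero-initialized. */
static const struct point sample_point = { .x = 1, .z = 3 };

const char *color_name(enum color c)
{
    assert((unsigned)c < COLOR_COUNT);   /* reject out-of-range values */
    return color_names[c];
}
```

Here `color_name(COLOR_BLUE)` yields `"blue"` even though the entries are written out of order, and `sample_point.y` is 0.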


You can assign integers to enum in C++, just like in C, but not to enum class. Use the latter when you need type safety, the former when you need to easily convert integer <-> enum, which apart from (de)serialisation is rare in my experience. And even in those cases, I personally prefer the explicit casts.

As far as I know, MSVC just implements the C subset of C++.


What I mean is this:

    > enum Foo { A, B, C };
    > Foo x = 1;
    test.cc:2:9: error: invalid conversion from ‘int’ to ‘Foo’ [-fpermissive]
     Foo x = 1;

I prefer not to use explicit casts because 1) using enum types is confusing to me 2) I frequently use values that are not in the enum - most often that's -1 as a way to indicate "missing". 3) I also like to iterate over all values from an enum using a plain for (int i = 0; i < NUM_FOOS; i++) without any fuss.

> As far as I know, MSVC just implements the C subset of C++.

There are things that work in MSVC C mode that do not work in C++ mode.
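
A sketch of that style, which compiles as C but not as C++. The names `foo_letters` and `find_foo` are invented for illustration.

```c
#include <assert.h>

enum foo { FOO_A, FOO_B, FOO_C, NUM_FOOS };

static const char foo_letters[NUM_FOOS] = { 'A', 'B', 'C' };

/* Hypothetical lookup: returns the matching enum value, or -1 for "missing". */
int find_foo(char letter)
{
    assert(NUM_FOOS == 3);            /* keep the letter table in sync with the enum */

    /* Plain int loop over the enum values; no casts needed in C. */
    for (int i = 0; i < NUM_FOOS; i++) {
        enum foo f = i;               /* implicit int -> enum: fine in C, an error in C++ */
        if (foo_letters[f] == letter)
            return f;
    }
    return -1;                        /* a value that is not in the enum */
}
```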


C++ is not a strict superset of C anymore. You can't compile all C in a C++ compiler.


has c++ ever been a strict superset of c?

afaik,

    int* arr = malloc(sizeof(int));

has never worked in c++.


It's probably never been a strict superset really.


To use a trivial example:

    int free;


care to explain which language this doesn't work in (and why)? I'm curious. AFAIK, `free` isn't a keyword in either language and this is just declaring an int variable with a legal identifier (but not initializing it).


This is actually pretty hard. I recommend reading C codebases, but it can be hard to know which ones are good. Code to C99 for now; keep an eye on C11 etc. Have a personal library of C implementations of things. Don't worry about being on Windows, although if you're doing Windows development C#/.NET is the lowest friction path.

I'm writing a book on C. I don't think it will ever be finished or see publication. I recommend writing a book yourself - it's a great way to learn, and maybe you're a better writer than me and will publish it.


It's hard to beat this list of quality books:

http://iso-9899.info/wiki/Books#Learning_C

Read some and do the exercises. You'll learn to write portable code to a standard rather than stuff that "accidentally works." There's a lot of crap C code in the world, full of GNUisms, overzealous macros, header-only nonsense abusing translation units, and copypasta. Don't pick up bad habits from that stuff.


Learn (a little bit of) assembly first. It will make more sense IMO if you came at it from below rather than above, so to speak.


I would recommend "Learn C the Hard Way" by Zed Shaw to get you started for something practical, because it goes over how C is written in the real world. Of course, situations will vary. The original book about C by K&R is an excellent book too.


Learn C the Hard Way is considered by the community to be, not exaggerating, one of the worst C books ever written.


Zed makes it clear in that book that he hates C and doesn't use it anymore due to its undefined behaviour.


Zed isn't actually very good at understanding undefined behavior and has demonstrated this loudly and repeatedly. I would steer clear of his book, personally.


> Zed makes it clear in that book that he hates C and doesn't use it anymore due to its undefined behaviour.

Really? Which part in the book specifically if you can cite the passage.


Smart man!


There is no clear source of truth for C. Depending on what you want to do, there are quite different approaches. C for embedded systems tends to look quite different from C for systems programming. For instance, MISRA is a C coding standard for embedded devices in motor vehicles (but many of its ideas come from writing robust embedded systems in general); it says don't use things like dynamic memory. You'd never follow that rule if you were writing software for Linux/Windows. So learning C "properly" would be about learning the different approaches people use in C to write robust code and what tradeoffs they are making. Understanding the C standard and the variations. Understanding bare metal programming. Understanding the effect of different underlying processors and hardware architectures. Understanding systems programming and how to consume APIs and how to create reusable binary modules. Understanding the compiler/build system, the intermediate files, code generation, and memory layout. Read a variety of books (including K&R) and study code (Lua's implementation is some very nicely written C).


I liked "C Programming: A Modern Approach"[1]. It has been several years but as I recall it was well suited for self study and was pretty explicit in calling out places it was talking about C99 as opposed to C89.

[1]: https://www.amazon.com/C-Programming-Modern-Approach-2nd/dp/...


You might also take a look at this Minecraft clone in C[1]. Uses sqlite to persist state so you can see how you might interact with a database as well. Quite modern and readable in that most functions outside main.c are <20 lines long. It's also cross platform and should work on Windows, Mac and Linux.

[1]: https://github.com/fogleman/Craft



omg no. Stop obsessing about this dragon. I'm not sure I've ever met it in my life (you know, other than segfaults from NULL dereferences which are UB too).


How much C have you written, and how much has gone through a high-performance optimizing compiler?


I don't know. Quite a bit - maybe written 300K lines and deleted > 200K lines. Most of it was compiled by gcc -O2 and/or MSVC. I've never relied on vectorization or used a lot of intrinsics, though.


I'm honestly quite surprised you've never run into surprising behavior caused by the optimizer handling undefined behavior…


Not that I'm aware of.


The Visual Studio C compiler (not the C++ compiler) supports "almost all" of C99. I would definitely recommend skipping C89 and using at least the new initialization features from C99, because they can reduce bugs caused by uninitialized data. Also, C++ compilers support a subset of C99 that's somewhere around "C95" and is already a much friendlier version compared to C89.

TBH, I think the best way to learn C is to tinker a bit with assembly first (it doesn't have to be a modern assembly dialect, just write a few 6502 or Z80 assembly programs in a home computer emulator). Coming from "below" instead of "above" C's abstraction level helps understanding the design rationale behind C, and the trade-offs between flexibility and unsafe memory handling.


I learned a few basic levels of C using my subscription to pluralsight.com.

But... I quickly switched to Rust as I see it being the C of the future as it continues to develop. That's just me though. (I have similar opinions towards things like Kotlin or Go.)

The books I am reading in the comments below are great resources.


With C++, going modern-first is a great approach. But C hasn't really been modernized that much, and because Microsoft has been so slow to fully support even just C99, let alone C11 or any GCC/clang extensions, often one has to stick to a subset of C99 anyways -- exactly as you surmised.

The problem with C is that it's extremely dangerous. Besides learning how to program in C, you'll really want to know how to use memory debuggers such as valgrind and Dr. Memory (https://www.drmemory.org/), as well as AFL for fuzzing, and various GCC and clang sanitizers. You'll really want to let functional programming language experience inform your approach to C API design and implementation.


Microsoft hasn't been slow, they have been quite clear, C is legacy on Windows, given its security implications, all new code should be C++ and the compiler is called Visual C++ after all.

C support is done to the extent of ISO C++ requirements.

For those that still want to use plain old C on Windows, they have contributed to clang, and it is now available as part of Visual Studio installer.


The most important difference between a higher level language and C is that you really have to understand a bit about the underlying hardware architecture to do it "right". C is a fairly thin veneer over assembly and it helps to have that mental model in mind when working in it.

It's a very small language with almost no batteries included, built very much around the manipulation of pointers and memory more than just about anything else.

To do C the "right" way means approaching problems from that perspective and building mountains out of toothpicks while thinking about how the billions of toothpicks need to interact with one another so they don't crash/catch fire/etc.


I'd just type in the examples from K+R, try them out, and modify them to do new things.


Do this but get the answer book too, because it's easy to get stuck in here.


K+R is pretty ancient.


That doesn't make it worthless; it's still probably the best book to learn C today if supplemented correctly.


So is C :-)


Becoming proficient in a language usually involves:

- writing code with it long enough,
- collaborating in a project based on it,
- reading code other people wrote,
- keeping yourself up to date by following the language evolution,
- going more in depth in language internals, compilers, etc.

I wrote C for 10+ years (mostly bare metal FW), yet I am still amazed at how little I know about it. Recently, for example, I learnt about all the things the dynamic linker does in Linux, how symbol memory addresses are resolved at runtime using the PLT, ....

The good point about C is that it can be used for very different kind of projects, so the choices are a lot.


This was a pure satisfaction when I have first seen it :)

https://en.wikipedia.org/wiki/Duff%27s_device


(Please don't write code like this anymore)


Sure, Duff's device is obsolete due to CPU zero-overhead looping, but it is still nice to see: a mind-boggling piece of code. :)
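
For readers who have not seen it, here is a sketch of the idea adapted to a plain memory copy. Duff's original wrote every value to a single memory-mapped output register, so it did not increment the destination; the name `duff_copy` and the early return for a zero count are additions made here for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Copy count bytes with the loop unrolled 8x. The switch jumps into the
 * middle of the do-while body to handle the count % 8 leftover bytes on
 * the first pass; every later pass copies a full 8 bytes. */
void duff_copy(char *to, const char *from, size_t count)
{
    assert(to != NULL && from != NULL);
    if (count == 0)
        return;                      /* the do-while below always runs at least once */

    size_t passes = (count + 7) / 8;
    switch (count % 8) {
    case 0: do { *to++ = *from++;
    case 7:      *to++ = *from++;
    case 6:      *to++ = *from++;
    case 5:      *to++ = *from++;
    case 4:      *to++ = *from++;
    case 3:      *to++ = *from++;
    case 2:      *to++ = *from++;
    case 1:      *to++ = *from++;
            } while (--passes > 0);
    }
}
```

It is legal C because `case` labels are just labels inside the switch body, and the fall-through is intentional. A plain loop or `memcpy` is the sensible choice today.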


(Please do write code like this anymore.)


The C FAQ used to be a good place to start. Note, used to be:

http://c-faq.com/

What did you want to do on Windows? I think this depends as much on the compiler as the code.


Second this. Read it all more than once; the insights it will give you into C are so valuable.

I'll also recommend C++ FQAs [1]. It's C++, yeah, but it's relevant to C in many places, and since a lot of systems programming now uses C++ as the lingua franca, it's a good, entertaining resource.

1: http://yosefk.com/c++fqa/


I would read K&R, it’s a rite of passage. Look at some of the larger open source projects for how to organize and manage a large C project.

And practice pointers. Get really comfortable with how memory works.


I’m currently learning C myself and something that I realized is that often the hard part to learn is not C the language but the domain in which it is being used.

For example, I’m trying to learn Linux drivers or embedded programming and I thought that I’d find in a C book information about registers and such, but registers don’t really belong to the C domain but to the hardware domain.

Thus, maybe find a way to clarify what you want to use C for and then learn those domain problems and how C solves them.


Programming in C requires you to take proper responsibility for what you write, but also allows for great power in return.

I've been working on a reasonably large (and cross platform) project in C (https://domeengine.com) for a couple years now, and I find that C generally requires a certain kind of discipline. When you interact with an API, you need to read the docs, figure out their specific preconditions, and requirements on handling frees, establishing what you own and what you don't.

It also helps to make sure that if you allocate something, you clean up after in a clear and well defined way.

You should also get very familiar with debuggers (LLDB/GDB) and other dev tools, to help when you have made a mistake.

In the modern era, you have a couple of options for getting it to run on Windows: C89 using Visual Studio, the MSYS2/MinGW2 gcc toolchain, or a gcc toolchain using the WSL. I use the MSYS2 tools for DOME because it requires the fewest platform-specific modifications to my code, but this varies by use case.


Pretty obvious answer, but for those who are new to C, the C Programming Language by Brian Kernighan and Dennis Ritchie is a good place to start


Best paired with a bunch of modern resources about how undefined behavior works.


An important question to ask yourself is why do you want to learn C? If you want to be able to read large C codebases, that makes sense. But if you want to write embedded software or systems software, rust or c++11 (ugh) are arguably a better place to start. I say this as someone who has learned C but moved on to c++ without regrets. Though I have serious rust envy :)


What is stopping you from getting into Rust?


My current work projects require c++ unfortunately. Some day. Sigh


while not a definitive guide, https://matt.sh/howto-c is a good example of how to approach "modern" c.

c99 is fully implemented in windows with msvc (2015), gcc, clang, and intel's compiler, so "right way" should not need to involve c89. most are c11 compliant as well.


If you're into visual/interactive learning, please give Harvard's CS50 a try.

https://www.youtube.com/channel/UCcabW7890RKJzL968QWEykA https://cs50.harvard.edu


First step is making sure you're using the flags

`-Wall -Werror -pedantic` and then one of `-std=c89` or `-std=c99`

(Or equivalent in whatever Windows C compiler)


Why would you suggest older versions?


I've written many tens of thousands of lines of C, but retired 15 years ago. The two best books I know are:

1. C: A Reference Manual, by Harbison and Steele. https://www.amazon.com/Reference-Manual-Samuel-P-Harbison/dp...

2. The C Puzzle Book, by Alan Feuer. https://www.amazon.com/Puzzle-Book-Alan-R-Feuer/dp/020160461...

Harbison and Steele has much better explanations than K&R. The Feuer book taught me a lot about C declarations. Declarations are the part of the C language that is the most unnecessarily difficult.

You asked about a slightly different question, best practices. But in the real world you'll run into a lot of code that practices below that level.


I would focus on sources that use C11 or C18, since there are some niceties in the newer standards.

As a reference, I like QEMU's source code[1]. It's huge, but the style and practices found in any file will help you get a grip on good C.

[1] https://github.com/qemu/qemu


This is a fun book to start with: "Expert C Programming" by Peter van der Linden https://www.amazon.com/Expert-Programming-Peter-van-Linden/d...


In Modern C (book mentioned by OP), there is takeaway 2.11.1.14 which says "Don't use NULL" because:

The definition in the C standard of a possible expansion of the macro NULL is quite loose; it just has to be a null pointer constant. Therefore, a C compiler could choose any of the following for it: 0U, 0, '\0', 0UL, 0L, 0ULL, 0LL, or (void * )0. It is important that the type behind NULL is not prescribed by the C standard. Often, people use it to emphasize that they are talking about a pointer constant, which it simply isn’t on many platforms. Using NULL in a context that we have not mastered completely is even dangerous. This will in particular appear in the context of functions with a variable number of arguments.

NULL hides more than it clarifies. Either use 0 or, if you really want to emphasize that the value is a pointer, use the magic token sequence (void * )0 directly.

https://www.gnu.org/software/libc/manual/html_node/Null-Poin... says:

The preferred way to write a null pointer constant is with NULL. You can also use 0 or (void * )0 as a null pointer constant, but using NULL is cleaner because it makes the purpose of the constant more evident.

If you use the null pointer constant as a function argument, then for complete portability you should make sure that the function has a prototype declaration. Otherwise, if the target machine has two different pointer representations, the compiler won’t know which representation to use for that argument. You can avoid the problem by explicitly casting the constant to the proper pointer type, but we recommend instead adding a prototype for the function you are calling.

https://man.openbsd.org/style says:

NULL is the preferred null pointer constant. Use NULL instead of (type * )0 or (type * )NULL in all cases except for arguments to variadic functions where the compiler does not know the type.

---

Readers: what is your take on it and why?


> Readers: what is your take on it and why?

It's ok to draw a line on what implementations you care about. And honestly, in this day and age, I couldn't care less about implementations that define NULL to be 0 (or something else that isn't a pointer). It's a case where it's not worth everyone's time to be catering for the lowest common denominator. Instead, fix that implementation or just go download a better one.

If I could make breaking changes to C today, I would remove the NULL macro and replace it with a lowercase null (or nil or whatever) keyword, a definition that doesn't impose a pointless burden on its users.

That said, I'm still very much in the habit of casting NULL to void* when I'm using it as an argument for variadic functions. It's silly, but that's just how we do things...
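
A sketch of the variadic case that the cast guards against. The function `total_length` and its demo are invented for illustration. Inside a variadic call the compiler has no parameter type to convert to, so if NULL expands to a plain `0` it is passed as an `int`, which may not have the size or representation of a pointer. Casting the terminator ensures a real null pointer is passed.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: sum the lengths of a null-terminated list of strings. */
size_t total_length(const char *first, ...)
{
    size_t total = 0;
    va_list ap;

    va_start(ap, first);
    for (const char *s = first; s != NULL; s = va_arg(ap, const char *))
        total += strlen(s);
    va_end(ap);
    return total;
}

/* Usage sketch: the terminator is cast so a pointer is passed, not an int. */
void total_length_demo(void)
{
    assert(total_length("ab", "cde", (const char *)NULL) == 5);
}
```

The first argument is a named parameter with a prototype in scope, so a bare NULL would be converted correctly there; only the arguments covered by the `...` need the cast.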


You could make a proposal to the C committee to add a keyword `_Nullptr` and to add a header `stdnull.h` which has `#define nullptr _Nullptr`, like they did with bool.


Yep. I'd be happier to work with the C committee if they were a little more open. For example, their mailing list could be public.


> Readers: what is your take on it and why?

All the bad stuff that people on both sides of this argument predict is going to happen somewhere in some code base, precisely because people on both sides can't agree and think the world would be better if everyone agreed on those people's personal aesthetic ideals.

So... don't sweat it. Conform to whatever code base you're working on has decided, and if it's inconsistent work to make it so.

And above all, no matter what else: Don't start dumb fights over this nonsense. Yeesh.


Use NULL, this is what it's for. 0 will convert to the null pointer representation, but the null pointer representation is not required to be bitwise identical to 0.


I am currently relearning C. I tried different books over the years (not to say decades) and never did more than a toy project. The book I picked up this time was Extreme C by Kamran Amini (chosen from Amazon reviews, fairly new book), and so far (a few chapters in) it's fairly enjoyable, as it takes C, the compilation process, etc. from the "basics", is fairly hands-on, not too theoretical, and relatively fast paced. I feel that the book is more about how to make a program in C than about literally teaching the language itself. So really what I needed.


I think you can use any version of Visual Studio to write C programs. I recently did a video tutorial using Visual Studio: https://www.youtube.com/watch?v=JaDzgmZvY00 on the "C++ Modulus Operator" https://www.mycplus.com/tutorials/cplusplus-programming-tuto...


Essential parts of C language are actually really a small set.

What I'd suggest is: learn the syntax and try to understand the Linux kernel (which is one of the most well-crafted pieces of software we have written in C). If you don't want to go that deep, you can check the sqlite source code as well. Writing code starts with reading it. Do yourself a favour and spend more time on reading code than reading books.


Perhaps better than a book, can anyone authoritatively point to a few small-and-readable but best-in-class open-source C projects to use as a reference?


* tweetnacl.cr.yp.to

Crypto library in the size of a hundred tweets [ https://twitter.com/tweetnacl ] by the only person with a genuine claim to being able to write safe C & company.

* https://github.com/mit-pdos/xv6-public

&

* https://github.com/mit-pdos/xv6-riscv

UNIX v6 clone in ANSI C by influential Plan 9-era Bell Laboratories employee and now influential Google employee Russ Cox, along with influential computer virus author and son of one of the original UNIX authors Robert T. Morris; entire source code fits in under a hundred pages of well-typeset documents [ warning, old copy, you should generate a modern one: https://pdos.csail.mit.edu/6.828/2011/xv6/xv6-rev6.pdf ].


not recent projects but everything that DJB implemented is absolutely elegant, low on bugs and great as a lesson for how to write secure C. e.g. qmail, djbdns, daemontools, there are a lot of ideas there you can learn from: https://cr.yp.to/

It also helps to build up (and refactor) your toolset over time, memory handling wrappers, logging, I/O, daemonize (https://github.com/jirihnidek/daemon) etc, so that you don't have to keep reinventing the wheel.

if I'd recommend one book then it's: https://en.wikipedia.org/wiki/Advanced_Programming_in_the_Un...


Any C-based project from the zeromq project, like https://github.com/zeromq/czmq, is worth reading. Pieter Hintjens started a book explaining all the design decisions there: https://hintjens.gitbooks.io/scalable-c/ Sadly, he didn't have time to finish the book.


Lua and Sqlite sources are excellent.


Linux as well. The comment culture of C devs really helps with understanding all of it.


There was a book featured in a HN post about a year ago - an older Linux source with extra commentary, used by students in China, I think (book is in English)



Sqlite


All the advice already here is very good and implementable.

Here are my two cents.

I have been programming in C professionally for the past 12 years, and I think the way to go would be to go implement a well known tool in C and then compare your code with the open source code of that tool. e.g. Git.

You will learn a lot from how the feature that you chose to implement in a certain way was implemented in a completely different way for various reasons.


It's not easy. The obvious answer is read and write lots of code. Experiment. Make sure you actually know how a computer works. Learning some assembly even if you don't ever use it can help with that (free ebook "Programming from the Ground up" can be done in a weekend) Read existing code bases, different styles, like linux or some GNOME stuff.


Finding "the right way" is closer to religion than it is to problem solving and science.

You will always be vulnerable to criticism and other people's opinions, and you need to open up and acknowledge this fact and work with that. There are best practices and principles you can adhere to. Stand up for your design decisions, and refactor when you have to.


I'd say there's no proper way. Every project has its way of writing C. With the use of macros, sometimes it looks a bit like a DSL for every specific project.

My advice: think about what kind of software you want to write and look for similar projects, libraries, or contribute with a new module, functionality, adapt something to your needs...


I would advise: AVOID C macros at all costs. They are so weak and hard to debug and problematic.

I write in several languages and read lots of C projects made by others and find the affirmation hard to believe: C is so simple that is very easy to read if the writer of code has a minimum of competency.

You can say that of languages like C++, especially things like C++11, that are languages by committee, so complex that you can affirm that every C++ programmer is different, or find programmers that cannot understand each other's code. Not so with C.

That happens to me with C++ and lisp code. It takes a while for me to understand what the author uses(or abuses) before being able to understand the code. That does not happen with C code.


The Linux kernel uses lots of macros, many of them with lowercase names that are barely distinguishable from function names. Given the quality of that codebase, I think not using macros is a bit silly.
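
Two examples of the kind of macro meant here, written as a simplified sketch in standard C. The kernel's own `ARRAY_SIZE` and `container_of` add compile-time type checks using GNU extensions; the struct and function names below are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Number of elements in a true array (not a pointer). */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Recover a pointer to the enclosing struct from a pointer to one of
 * its members. Simplified: no type checking of ptr against member. */
#define CONTAINER_OF(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct list_node { struct list_node *next; };

struct task {
    int id;
    struct list_node node;   /* intrusive link embedded in the struct */
};

/* Hypothetical accessor: map a list node back to its owning task's id. */
int task_id_from_node(struct list_node *n)
{
    assert(n != NULL);
    return CONTAINER_OF(n, struct task, node)->id;
}
```

Neither can be written as an ordinary function: one needs the array type before it decays to a pointer, the other takes a type and a member name as arguments.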


Get code reviews from someone that actually knows C. Mentorship and harsh code reviews are the best way to learn.


I've used C since 1992, but I'm never confident in my C code unless it's heavy with macros to check error codes and return early, register memory allocations and free memory at return, and so on. Rust improved my confidence in C code a lot, but it looks more like Rust code now.
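
A minimal sketch of one common shape for that style: a check macro that jumps to a single cleanup label. This is not the commenter's actual code; the macro `CHECK` and the functions `reject_spaces` and `duplicate_checked` are all invented for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical macro: store the error code in rc and bail out if nonzero.
 * Relies on a local `int rc` and a `cleanup:` label in the calling function. */
#define CHECK(expr)              \
    do {                         \
        rc = (expr);             \
        if (rc != 0)             \
            goto cleanup;        \
    } while (0)

/* Toy validation step: fails with -1 if the string contains a space. */
static int reject_spaces(const char *s)
{
    return strchr(s, ' ') != NULL ? -1 : 0;
}

/* On success returns 0 and stores a heap copy of src in *out (caller frees).
 * On any failure returns nonzero, leaves *out NULL and leaks nothing. */
int duplicate_checked(const char *src, char **out)
{
    int rc = 0;
    char *copy = NULL;

    assert(out != NULL);
    *out = NULL;

    CHECK(src == NULL);              /* rc becomes 1 on a NULL input */
    copy = malloc(strlen(src) + 1);
    CHECK(copy == NULL);             /* rc becomes 1 on allocation failure */
    strcpy(copy, src);
    CHECK(reject_spaces(copy));      /* a failure here must free copy */

    *out = copy;                     /* ownership passes to the caller */
    copy = NULL;

cleanup:
    free(copy);                      /* free(NULL) is a no-op */
    return rc;
}
```

Every exit goes through the same label, so each resource has exactly one place where it is released.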


In my opinion, one of the main reasons C looks old and not practical to beginners is the standard libs API.

But it is perfectly possible to use C as a language with a more modern and easier to read API, but you'll probably have to build your own.


There are other compilers for windows than MSVC++, so you can move beyond C89 if you wish.

stdint.h is useful and should be required reading. I still come across too many C projects that reimplement it poorly.

Learn to use valgrind, and maybe ddd too.
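
A short sketch of what stdint.h (plus inttypes.h for the printf macros) gives you instead of home-grown `u32` typedefs. The helper names are invented for illustration.

```c
#include <assert.h>
#include <inttypes.h>   /* includes <stdint.h> and adds the PRI* format macros */
#include <stdio.h>

/* Hypothetical helper: read a 32-bit big-endian value from a byte buffer.
 * Exact-width types make the shifts behave the same on every platform. */
uint32_t read_be32(const uint8_t *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Format a uint32_t portably; PRIu32 expands to the right conversion
 * specifier whether uint32_t is unsigned int or unsigned long. */
int format_u32(char *buf, size_t n, uint32_t v)
{
    return snprintf(buf, n, "%" PRIu32, v);
}
```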


You can compile with clang on Windows, so you can use later standards.


The main thing that C provides over other languages is direct hardware access. So study the hardware, find out what you want to do with it, then use C to implement it.


Besides everything already mentioned to read or do, I would suggest rewriting in C something you previously coded in Python or Go.


Read John Regehr's blog, eg https://blog.regehr.org/archives/213

Read "Who Says C is simple?": http://cil-project.github.io/cil/doc/html/cil/

Take up a different language :-)


To start with, "The C Programming Language", Kernighan & Ritchie (K & R) [1]. Find the latest edition that you can buy.

I think the primary topic to master in C is pointers. This is where most falter. It takes a few years to "master" (if we ever do). Here I would recommend "Understanding and Using C Pointers", Richard Reese. [2]

If you are interested in networking, any of the classic "TCP/IP Illustrated Vols I/II/III", W. Richard Stevens, [3] contain a ton of C code to implement components of TCP/IP.

If you are interested in Graphics, then "Graphics Gems", Andrew Glassner [4] is a good source.

"An Introduction to GCC", Brian Gough, [5] to understand the tooling and its various bells and whistles.

My "learn to swim by jumping into the deep end of the pool" experience was learning Windows programming from the Charles Petzold book and navigating the Microsoft Foundation Classes in the late 80s/early 90s. The state of the art in tooling wasn't great in those days, and I spent months with the book to get things going. This came after I had built a foundation with K&R and a decent amount of Unix network programming.

I see a lot of the other posts recommend more modern books. But you still need to build your foundation on C and Pointers in particular.

Good luck on your journey.

[1] https://www.amazon.com/Programming-Language-2nd-Brian-Kernig...

[2] https://www.amazon.com/Understanding-Using-Pointers-Techniqu...

[3] https://www.amazon.com/TCP-Illustrated-Protocols-Addison-Wes...

[4] https://www.amazon.com/Graphics-Gems-Andrew-S-Glassner/dp/01...

[5] https://www.amazon.com/Introduction-GCC-GNU-Compilers/dp/095...


I am in the same boat as you are... bear with me, I am by no means an expert.

In fact, I read a couple of chapters of Modern C yesterday :). Here are some of the things I am doing to bring my C skills closer to those of the professional developers I admire.

Decide which platform to use

~~~~~~~~~~~~~~~~~~

Unfortunately, to become proficient in C we need to write code and focus on a platform. I have been torn between developing on Windows and on Linux. I am very experienced in the Windows environment (debuggers, cl, linkers, WinDbg, etc.), but when it comes to writing good-quality C code (not C++) and learning how to write a maintainable, moderately large code base, my research showed that the Windows compilers and C standard APIs are not great; in fact, they hinder your productivity. I have wasted countless hours just figuring out how to properly create a simple C project with a decent build system. Unfortunately, I could not find one. The closest I could find is CMake, as MSBuild is a nightmare to work with. I even tried NMAKE but failed.

When it comes to documentation of basic C API usage, MSDN (https://docs.microsoft.com/en-us/cpp/c-runtime-library/run-t...) does a decent job. But in the name of security you will find a zillion variations (_s, _l) of an API, and by default the MSVC compiler will not let you use some of the APIs in their simple forms. Instead, you have to define _CRT_SECURE_NO_WARNINGS etc. I think that for someone just getting started on learning to write a decent code base in C, these restrictions really hinder productivity. So I have finally decided to focus my learning on the Linux platform (currently through WSL, the Windows Subsystem for Linux) with its POSIX APIs. You know what, `man 3 printf` or `man 3 strlen` is soooooo much better than googling MSDN.

Mastering C

~~~~~~~

I think the simple and straight answer here is reading good code, writing "good" code, and reading good C content (be it books or articles). These are the three ingredients necessary to get started. Of all the open source projects that I have investigated, I found that the Linux kernel and related projects seem to have very good taste in terms of code quality. Currently, I am focused on how they use the language rather than on what the project actually does: how they structure the project, how they name things, how they use types, how they create structures, how they pass structures to functions, how they use lightweight object-based constructs, how they handle errors in functions (for example, forward-only goto exits), how they use signed/unsigned variables (more of my learnings at the end), and how they use their own data structures. I think it's good to focus initially on the ANSI C API with C99 instead of relying heavily on the OS-specific API of whichever platform you choose. For example, such projects could be binary file parsers for formats like .ISO.
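
To make the "forward only goto exits" idea concrete, here is a minimal sketch of how I understand the pattern. The function name read_prefix and its parameters are made up for illustration; this is not code from the kernel or e2fsprogs. Each resource gets a label, and every failure jumps forward to the label that releases only what has been acquired so far.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads up to `cap` bytes from the file at `path` into a
 * freshly allocated buffer. Returns 0 on success, -1 on failure. */
int read_prefix(const char *path, size_t cap, char **out, size_t *out_len)
{
    int ret = -1;
    FILE *fp = NULL;
    char *buf = NULL;
    size_t n;

    assert(path != NULL && out != NULL && out_len != NULL);

    fp = fopen(path, "rb");
    if (fp == NULL)
        goto out;               /* nothing acquired yet */

    buf = malloc(cap ? cap : 1);
    if (buf == NULL)
        goto out_close;         /* only the file needs releasing */

    n = fread(buf, 1, cap, fp);
    if (ferror(fp))
        goto out_free;          /* release the buffer, then the file */

    *out = buf;
    *out_len = n;
    buf = NULL;                 /* ownership moved to the caller; free(NULL) is a no-op */
    ret = 0;

out_free:
    free(buf);
out_close:
    fclose(fp);
out:
    return ret;
}
```

The caller owns the returned buffer and frees it with free(). The cleanup code exists exactly once, and the labels read bottom-up in the reverse order of acquisition.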

Good C projects/articles

~~~~~~~~~~~~~~~

1. wimlib (wimlib.net) - https://github.com/jcpowermac/wimlib is a great source of information

2. e2fsprogs - https://git.kernel.org/pub/scm/fs/ext2/e2fsprogs.git/

3. MUSL - https://git.musl-libc.org/cgit/musl/tree/

4. General C Coding Style - https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...

5. https://nullprogram.com/tags/c/ - great source of C knowledge

6. CCAN - https://github.com/rustyrussell/ccan/tree/master/ccan - great source of C tidbits from none other than Rusty Russell - I haven't read all of them

7. POSIX 2018 standard - https://pubs.opengroup.org/onlinepubs/9699919799.2018edition...

continued in the comment....


My Learnings (know your language/know your compiler/know your tools)

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1. The _t suffix is the convention used to denote a typedef for a given type

    /* This example is from do_journal.c in e2fsprogs */
    struct journal_transaction_s {
        ...
        blk64_t start, end;
        ...
    };
    typedef struct journal_transaction_s journal_transaction_t;

2. Know about headers like stddef.h and stdint.h and when they are supposed to be used. for example: When to use normal data types like int vs int16_t etc.

3. From https://en.cppreference.com/w/c/types/integer we can see that the intN_t types are exact-width types, which might have some perf side effects if the underlying hardware does not support that width natively.

    For example, in Visual Studio x86/x64 we have typedef short int16_t; and
    typedef int int32_t;

    The int_fastN_t types, on the other hand, pick a suitable width which maps
    natively to the available hardware type. For example, in Visual Studio
    x86/x64 we have typedef int int_fast16_t; instead of typedef short
    int_fast16_t;

    size_t, on the other hand, aliases the natural unsigned word length of the
    hardware. For example, on x86 typedef unsigned int size_t; and on x64
    typedef unsigned __int64 size_t;
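
A quick way to see these widths on your own compiler is to measure them instead of trusting a table. The sketch below is only an illustration; the names width_report, measure_widths and format_u64 are made up. It also shows the PRIu64 macro from inttypes.h, which saves you from guessing whether a 64-bit value needs %lu or %llu.

```c
#include <assert.h>
#include <inttypes.h>   /* PRIu64; also pulls in stdint.h */
#include <limits.h>     /* CHAR_BIT */
#include <stddef.h>
#include <stdio.h>

/* Hypothetical record of the widths, in bits, on the current platform. */
struct width_report {
    size_t exact16_bits;   /* int16_t: always exactly 16 */
    size_t fast16_bits;    /* int_fast16_t: at least 16, often wider */
    size_t size_bits;      /* size_t: platform dependent */
};

struct width_report measure_widths(void)
{
    struct width_report r;

    r.exact16_bits = sizeof(int16_t) * CHAR_BIT;
    r.fast16_bits = sizeof(int_fast16_t) * CHAR_BIT;
    r.size_bits = sizeof(size_t) * CHAR_BIT;
    return r;
}

/* Formats a 64-bit value without guessing the right printf length modifier.
 * Returns what snprintf returns: the number of characters needed. */
int format_u64(char *buf, size_t cap, uint64_t value)
{
    assert(buf != NULL && cap > 0);
    return snprintf(buf, cap, "%" PRIu64, value);
}
```

Printing the three fields of measure_widths() under different compilers and targets is a cheap way to check the typedefs quoted above.
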
4. Know your compiler's predefined standard macros. On the Microsoft compiler:

    _WIN64      - defined when we are compiling for x64
    _WIN32      - defined when either x86 or x64 code is being compiled
    _MSC_VER    - tells us which compiler version we are using; identifies different Visual Studio versions
    __cplusplus - defined when the translation unit is compiled as C++

5. We can get a FILE* from a HANDLE using the APIs below from io.h and fcntl.h. Once we get the FILE* we can use fgets for line-oriented string operations

    Fd = _open_osfhandle((intptr_t)Handle, _O_TEXT);
    File = _wfdopen(Fd, L"r");

6. Learned about var args and aligned memory. Aligned memory means the address returned by _aligned_malloc is always divisible by the alignment we specify. For example, with char *p = _aligned_malloc(10, 4); the address returned in p will always be divisible by 4. We should also free the allocated memory using _aligned_free(p)
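
Since _aligned_malloc is MSVC-only, here is a rough sketch of how the same idea could be wrapped so it also builds elsewhere. The wrapper names xaligned_alloc, xaligned_free and is_aligned are hypothetical. On non-Windows platforms I am assuming a C11 library that provides aligned_alloc.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#if defined(_WIN32)
#include <malloc.h>   /* _aligned_malloc, _aligned_free */
#endif

/* Hypothetical wrapper: `alignment` must be a power of two. */
void *xaligned_alloc(size_t alignment, size_t size)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
#if defined(_WIN32)
    return _aligned_malloc(size, alignment);   /* note: size comes first here */
#else
    /* C11 aligned_alloc wants size to be a multiple of alignment, so round up. */
    size_t rounded = (size + alignment - 1) / alignment * alignment;
    return aligned_alloc(alignment, rounded);
#endif
}

/* Memory from xaligned_alloc must be released through this function. */
void xaligned_free(void *p)
{
#if defined(_WIN32)
    _aligned_free(p);
#else
    free(p);
#endif
}

/* "Divisible by the alignment" expressed as a check on the address. */
int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}
```

The one thing to remember is that the two MSVC functions are a pair: memory from _aligned_malloc must never go to plain free().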

7. atoi(str) processes the input string only as far as it can convert. For example, atoi("123asda") will still give 123 as the return result. Any whitespace at the beginning of the input string is ignored, so atoi(" 123asd") will still return 123. It is recommended to use the strto* functions (strtol, strtod, etc.) to convert strings to int/long/float types, as they can also return a pointer to the first character that was not converted
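
Here is a small sketch of what strict parsing with strtol can look like. The helper name parse_long and its return convention are my own invention. It rejects the cases atoi silently accepts: trailing junk, no digits at all, and overflow.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parses the whole string as a base-10 long.
 * Returns 0 on success, -1 if there are no digits, trailing characters,
 * or the value does not fit in a long. */
int parse_long(const char *text, long *out)
{
    char *end = NULL;
    long value;

    assert(text != NULL && out != NULL);

    errno = 0;                        /* strtol only sets errno on failure */
    value = strtol(text, &end, 10);

    if (end == text)                  /* no digits were consumed */
        return -1;
    if (*end != '\0')                 /* stopped early, e.g. at "asda" */
        return -1;
    if (errno == ERANGE)              /* clamped to LONG_MIN / LONG_MAX */
        return -1;

    *out = value;
    return 0;
}
```

Like atoi, strtol still skips leading whitespace, so " 123" parses fine. If you want to reject that too, check the first character yourself before calling it.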

8. UCRT supports around 40 POSIX system-level APIs, but most of these have a _ prefix. wimlib, in wimlib_tchar.h, defines #define topen _open for Win32 and #define topen open for POSIX systems. The takeaway here is that even though the UCRT implementations differ in name, the parameters are essentially the same.

    For example:
      UCRT Win32: int _open(const char *filename, int oflag, int pmode);
      POSIX:      int  open(const char *pathname, int flags, mode_t mode);
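
    A sketch of how the rename trick can be used in your own code follows. Only topen comes from wimlib; tclose, T_RDONLY and can_open_for_reading are names I made up for this example.

```c
#include <assert.h>
#include <fcntl.h>
#if defined(_WIN32)
#include <io.h>
#define topen    _open
#define tclose   _close        /* hypothetical, by analogy with topen */
#define T_RDONLY _O_RDONLY     /* hypothetical flag alias */
#else
#include <unistd.h>
#define topen    open
#define tclose   close
#define T_RDONLY O_RDONLY
#endif

/* Hypothetical helper: returns 1 if `path` can be opened for reading, else 0.
 * The body is identical on both platforms; only the macros differ. */
int can_open_for_reading(const char *path)
{
    int fd;

    assert(path != NULL);

    fd = topen(path, T_RDONLY);
    if (fd < 0)
        return 0;
    tclose(fd);
    return 1;
}
```

    The rest of the program only ever sees the t-prefixed names, so the platform check lives in one header.
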
9. We can install only build tools(VC compiler) excluding IDE from https://aka.ms/buildtools

10. Best video on the C Standard and some of its lesser-known features - "New" Features in C - Dan Saks

    Year  C Standard  Comments
    1983              C standard committee is formed
    1989  C89         C89 US standard
    1990  C90         C89 International Standard
    1999  C99         C99 Standard
    2011  C11         C11 Standard
    2018  C18         C18 Bugfix release

    _reserved - Reserved for global scope. But we can use any identifier with an
    _ as a local variable or a structure member

    __reserved - Always reserved. Meaning the user program should not use any
    variable with two underscores __

    _Reserved - Always reserved. Meaning the user program should not use any
    variable with an underscore and capital letter.

    This is the reason why _Bool is named that way to prevent breaking existing
    bool typedef used in existing code.
11. Another good video on lesser known C features - Choosing the Right Integer Types in C and C++ - Dan Saks - code::dive 2018

    we can use CHAR_BIT from limits.h instead of 8. For example, when you want
    to print the bits in an integer, we can do the following: `for (size_t i =
    sizeof(int) * CHAR_BIT; i-- > 0; ) {...}`
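
    One trap when counting down with size_t: a condition like `i >= 0` is never false for an unsigned type, so the loop would never end. Here is a minimal sketch of a countdown that does terminate. The function name format_bits is made up for illustration.

```c
#include <assert.h>
#include <limits.h>   /* CHAR_BIT */
#include <stddef.h>

/* Hypothetical helper: writes the bits of `value`, most significant first,
 * into `buf` as '0'/'1' characters followed by a terminating NUL.
 * `buf` must have room for sizeof(unsigned) * CHAR_BIT + 1 characters. */
void format_bits(unsigned value, char *buf)
{
    const size_t nbits = sizeof value * CHAR_BIT;
    size_t pos = 0;

    assert(buf != NULL);

    /* "i-- > 0" tests before decrementing, so the body sees nbits-1 .. 0
     * and the loop stops cleanly instead of wrapping around. */
    for (size_t i = nbits; i-- > 0; )
        buf[pos++] = ((value >> i) & 1u) ? '1' : '0';
    buf[pos] = '\0';
}
```

    Calling format_bits(5u, buf) fills buf with a run of zeros ending in 101.
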
12. size_t is the unsigned type that sizeof yields, and on MSVC it matches the natural word size of the target architecture. So for 32-bit it is a 4-byte unsigned quantity and for 64-bit it is an 8-byte unsigned quantity. Hence it is defined as follows

    #ifdef _WIN64
        typedef unsigned __int64 size_t;   //8 bytes on x64
    #else
        typedef unsigned int     size_t;   //4 bytes on x86
    #endif

    whereas uintmax_t denotes the largest integer type that is available in the
    language. So on 32-bit you can still represent a 64-bit quantity using
    long long even though it is not what the architecture directly maps to.
    Below is how it is defined on both x86 and x64

    typedef unsigned long long uintmax_t;  //in MSVC both x86 and x64 support 64-bit quantities using long long

    So size_t does not give us the maximum unsigned integer; instead it gives
    us the native unsigned integer, i.e., on x86 it is 32 bits and on x64 it
    is 64 bits. So the recommendation is to use size_t wherever possible
    instead of int. For example:

    int len = strlen(str); // not recommended because on both x86 and x64 of MSVC int is 4 bytes due to LLP64
    size_t len = strlen(str); // recommended because size_t automatically maps to 4 bytes on x86 and 8 bytes on x64
13. C11 introduced static asserts: assertions that are evaluated at compile time. C11 adds a new keyword, _Static_assert(expr, message). The reason for the ugly name is the same idea of not breaking existing code, so for convenience the assert.h header provides a static_assert macro which means the same. One use of static asserts is below; compilation fails with the given message if the compiler has inserted padding into the structure

    struct book {
      int pages;
      char author[10];
      float price;
    };

    static_assert(sizeof(struct book) == sizeof(int) + 10 * sizeof(char) + sizeof(float),
                  "structure contains padding holes!");
14. Another good video on some low level details - Storage Duration and Linkage in C and C++ - Dan Saks

15. #define _CRT_SECURE_NO_WARNINGS (placed before the CRT headers are included) can be used to disable CRT warnings for common functions.

16. Any UCRT function which begins with _ is a non-standard API provided by UCRT, for example _strdup, _strlwr, and _strrev in string.h. The takeaway here is that it is easy to identify which functions are part of the C standard and which are not. Interestingly, some (not all) of these non-standard functions are part of POSIX, so in glibc (which implements POSIX) they don't have the _ prefix.

17. All functions in the POSIX standard with a [CX] annotation are extensions to the ISO C standard. For example, the functions below from stdlib.h and stdio.h are POSIX extensions. UCRT defines similar APIs such as _putenv; since these are not part of the C standard, the UCRT versions have an _ prefix

    stdlib.h - posix
    [CX] int setenv(const char *, const char *, int);
    stdlib.h - ucrt
    int _putenv( const char *envstring );

    stdio.h - posix
    [CX] int fileno(FILE *);
    stdio.h - ucrt
    int _fileno( FILE *stream );
18. Learned about CGold: The Hitchhiker’s Guide to the CMake. An awesome tutorial about CMake. Now it is super easy to start a C project without worrying about the individual build systems.

    # CMakeLists.txt - minimum content
    cmake_minimum_required(VERSION 3.4)
    project(command_line_parser)
    add_executable(command_line_parser main.c)

    # commands to run to generate the respective native build files like vcxproj files
    # In the command below, -S stands for the source directory path.
    # -B stands for the directory where the vcxproj files are generated
    # CMake only generates one flavor (x64/x86) per project file; here we are generating x64 by specifying x64
    cmake -S . -B builds -G "Visual Studio 16 2019" -A x64
    # we can also use cmake-gui to do the above

    # Once vcxproj files are generated we can either directly build the proj files using Visual Studio
    # or better use cmake itself to build it for us from CMD using msbuild
    cmake --build builds


Hope these help.


> _t suffix is the notion used to denote a typedef for a given type /* This example is from do_journal.c in e2fsprogs / struct journal_transaction_s { ... blk64_t start, end; ... }; typedef struct journal_transaction_s journal_transaction_t;

This is undefined behavior in POSIX.


And

    unsigned total(unsigned a, unsigned b) { return a + b; }
is undefined behavior in C.


Speaking of the _t suffix. I think it's being largely abused: we say 'size_t' (instead of 'size') because it is not obvious whether 'size' is a type or not; on the other hand, in the case of, say, 'int32_t' it is clearly redundant and therefore has always looked kinda silly to me.


Radical suggestion: read the C standard.


Asking someone to read a multi-hundred page standards document just to learn the language is wholly unreasonable.


Well, OP asked for source of truth. Nothing comes as close as the standard itself.

Honestly, I think any programmer writing C today should have it around for reference. And yeah you kinda need to read it too, or you won't be able to refer much.

That said, no need to read it cover to cover. There's stuff one can earmark as being there but ignore until it's actually needed (for example: the grammar and all the library functions).

And speaking of tedu, I would recommend to the OP that they get in the habit of checking out the OpenBSD man pages for libc functions.


> Well, OP asked for source of truth. Nothing comes as close as the standard itself.

An unfortunate reality is that standards are not always strictly followed (whether intentionally or not). C is almost certainly better about following the standards than other pieces of tech (notably web browsers), but I doubt the C compilers are perfectly compliant if you look hard enough.


I actually have very few $200 PDFs lying around.

Do you really go purchase the standard documents from ISO? Or does your employer? I'd love to have the actual standard "around for reference," but 700-page technical documents from ISO are not cheap.


The next best thing is the latest draft standards, which are available online for free (and I always have them on my laptop, PC, and personal server). Differences between them and final official standard are not important. Search for N1256, N1570. Or grab them here. http://www.open-std.org/jtc1/sc22/wg14/www/standards

(There are more readable html versions floating about)


Screw C, really. Let this crappy old language die off quietly. Learn Zig (the best replacement for C) or at least Rust.


I'd check out Zed Shaw's book [0]. It's opinionated, but I think that's a good thing for something like C.

It's broken up into exercises that you start working through straight away, and you start early with valgrind.

[0] https://learncodethehardway.org/c/


The author of that book doesn't really like C, and it shows. And valgrind is usually not the first tool you should be using to debug memory issues–try Address Sanitizer instead.


C is great for learning about how computers work at a low level. In my college, we started by writing a simple compiler. This should force you to understand pointers and memory which, as others have mentioned, are fundamental. You'll have to both know how the assembly works and to write correct C code. So writing a simple compiler will force you to understand it two different ways at once. My opinion is you'll learn more by diving in and doing things, especially with your existing programming background.

But, for your own sanity and everyone else's, do not start new projects in C! (Aside from purely academic ones for you to learn.) The point of a programming language is to help humans write safe and correct code. In this sense, C has completely failed. The syntax itself is just begging you to make a subtle mistake. And due to lack of memory safety, even the most well tested and scrutinized projects in C/C++ (such as Chromium) suffer from buffer overflows or memory leaks, potential security vulnerabilities that are automatically prevented by design in other languages. If you need to do something low level with any sort of confidence, use a memory-safe language like Rust, which can even do interop with existing C libraries and code.

(Edit: typo)


> The syntax itself is just begging you to make a subtle mistake.

C's syntax is fine (if a bit ugly around declarations), it's really the memory safety that gets you.


It depends on what you're doing. Low level stuff pretty much has to be C. But for applications I agree there are much better alternatives these days.


Yeah, there may be some situations where C is necessary as a wrapper for assembly if you're _extremely_ memory constrained or something like that. But I wouldn't have any confidence in it working as intended unless it's dead simple.



