Principles for C programming (drewdevault.com)
240 points by ddevault on March 16, 2017 | 147 comments



Avoid magic. Do not use macros.

Disagree. Use magic, especially macros, in ways such that your code becomes easier, not harder, to understand.

A few examples from my own code:

1. My "elastic arrays" (https://github.com/Tarsnap/libcperciva/blob/master/datastruc... and .c) allow me to write

    ELASTICARRAY_DECL(STRLIST, strlist, const char *);
and get a data structure STRLIST which contains an arbitrary number of strings and functions strlist_init, strlist_append, strlist_get, strlist_free, etc. for accessing the array. Compared to the non-macro approach of keeping track of the array size and resizing as needed, this makes code vastly simpler. (Of course, this sort of data structure is built into most non-C languages already.)

2. My "magic getopt" (https://github.com/Tarsnap/libcperciva/commit/53d00e5bd0478f...) allows me to something which looks and behaves just like a standard UNIX getopt loop, except with support for --long options. Yes, the implementation is mildly insane (and needs to work around a bug in clang!), but it allows for code which is vastly simpler than other getopt-with-long-options alternatives.

3. My "cpu features support" framework (https://github.com/Tarsnap/libcperciva/blob/master/cpusuppor...) makes use of both macros and some tricky edge cases of C object linkage rules, but makes it trivial for me to add support for new CPU features.

4. Soon to be released, the PARSENUM macro (WIP: https://github.com/Tarsnap/libcperciva/blob/parsenum-additio...) which allows me to write

    PARSENUM(&n, "1234");
    PARSENUM(&x, "123.456");
    PARSENUM(&s, "123", 0, 1000);
where the first argument is a pointer to a variable of any integer or floating-point type to which is assigned the numeric value of the string in the second argument; for floating-point values and unsigned integers, the two-argument form range-checks the value against the bounds of the type, while the four-argument form range-checks against the provided bounds. (Basically, this is strtonum on steroids.)

In all of these cases, you will never need to understand how these macros work. Instead, you can simply treat them as language extensions which allow you to write cleaner and simpler code.
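As a rough illustration of the first example, usage might look something like this (the signatures here are sketched from the naming convention above rather than copied from the real header):

    #include <stdio.h>
    #include "elasticarray.h"   /* assumed header name */

    ELASTICARRAY_DECL(STRLIST, strlist, const char *);

    /* Hypothetical usage; function names and signatures are assumed from
     * the prefix convention described above. */
    static void demo(void)
    {
        STRLIST names = strlist_init(0);
        const char * str = "hello";

        strlist_append(names, &str);
        printf("%s\n", *strlist_get(names, 0));
        strlist_free(names);
    }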


>1

I'm conflicted about this one. I choose not to try and emulate generics with macros because adding language features with macros is a terrible idea. On the other hand, I recognize the problem with void*. It's a matter for debate but I definitely fall on the "don't use macros for this" side.

>2

This is a case where I would rather write more code than rely on magic. You have to have this awful hacky opaque implementation in exchange for a trivial improvement in ergonomics. No thanks.

>3

I mean, just look at this code. Or better yet, have someone else look at it. This is totally unreadable and unmaintainable, all for a marginal ergonomics improvement.

>4

Just use strtol or strtof. Do you really run into this that often?

All of these are demonstrating exactly the problem I have with a lot of C authors. You build these esoteric systems that use heaps of unmaintainable, unreadable code to provide marginal gains elsewhere.


Just use strtol or strtof

Which of these is more likely to have bugs?

    int i;
    
    if (PARSENUM(&i, s, 0, 1000))
        err("Invalid input: %s", s);
or

    int i;
    long l; // need a temporary long to avoid overflow
    char * ep;
    
    errno = 0;
    l = strtol(s, &ep, 0);
    if ((ep == s) || (*ep != '\0'))  // make sure we parsed a number and don't have trailing garbage
        errno = EINVAL;
    if ((l < 0) || (l > 1000))
        errno = ERANGE;
    if (errno)
        err("Invalid input: %s", s);
    i = l;
You build these esoteric systems that use heaps of unmaintainable, unreadable code

Not at all. The point is to have self-contained routines which Just Work in order to ensure that the rest of the code is easier to read and maintain.


To be honest I would just have your library offer a function that does it your fancy way. I agree that there could be better integer parsing functions, I disagree that they should be macros.

>Not at all. The point is to have self-contained routines which Just Work in order to ensure that the rest of the code is easier to read and maintain.

Self-contained routines that are completely unmaintainable and unintelligible to anyone but you, though. Not worth it.


To be honest I would just have your library offer a function that does it your fancy way.

It's not possible to have a single function do this. Not possible to have any finite number of functions do this if, like me, you want to support all floating-point and integer types.

Self-contained routines that are completely unmaintainable and unintelligible to anyone but you, though.

All of my macros are intelligible to anyone who understands the C preprocessor.

More to the point, why do they need to be maintainable? When was the last time you maintained the strtof function in your C library?


> More to the point, why do they need to be maintainable? When was the last time you maintained the strtof function in your C library?

A few years ago. https://sourceware.org/bugzilla/show_bug.cgi?id=15744

Acting like you can get anything done right in C simply because it's self-contained is proven wrong every day. It's good practice, yes, but doesn't magically (we like this word now) make us immune to error. Everything needs to be maintainable, even if it is proven to be correct, for the simple reason that we can't just replace broken pieces of code with the same simplicity we can replace a broken fridge.

Also, if there's anything we know about code, it's that we are constantly trying to invent new ways to expose it to a new interface - thereby breaking it.


> we can't just replace broken pieces of code with the same simplicity we can replace a broken fridge.

Sure we can. Recompile glibc, dynamic linking, pow.

I wouldn't say that bug you referenced is a strike against C; it could happen in any language that was locale-aware and parsing floats.

For what it's worth, these macros in libcperciva are perfectly readable and maintainable:

  #define ELASTICARRAY_DECL(type, prefix, rectype)			\
  	static inline struct prefix##_struct *				\
  	prefix##_init(size_t nrec)					\
  	{								\
  		struct elasticarray * EA;				\
  									\
  		EA = elasticarray_init(nrec, sizeof(rectype));		\
  		return ((struct prefix##_struct *)EA);			\
  	}								\
  ...
The advice should really be "be careful, it's really easy to write shitty macros, so don't".


> Sure we can. Recompile glibc, dynamic linking, pow.

Oh, so I have to maintain glibc. That's the alternative, we are in complete agreement.

> I wouldn't say that bug you referenced is a strike against C; it could happen in any language that was locale-aware and parsing floats.

I was responding to a very specific (and pointed) question.

> For what it's worth, these macros in libcperciva are perfectly readable and maintainable:

I'm not contesting that, they might be. The one you list looks fine at a glance, but that doesn't prove anything.


>It's not possible to have a single function do this. Not possible to have any finite number of functions do this if, like me, you want to support all floating-point and integer types.

Just tack an 'i' or 'f' or 'l' on the end. Not a big deal. Literally one extra character. And you don't have to hold shift the whole time.

>All of my macros are intelligible to anyone who understands the C preprocessor.

I understand the C preprocessor fine and I have to stop and mentally decode all the crap your macros are trying to do.

>More to the point, why do they need to be maintainable? When was the last time you maintained the strtof function in your C library?

All code is liable to have bugs. All code is liable to have performance issues. All code is liable to use an outdated syntax or features in 10 years. What if the stdlib does improve and adds functions that supplant these? What if they do so only partially? What if your functions could be improved to rely on a new stdlib function that's safer/faster/etc?


Just tack an 'i' or 'f' or 'l' on the end. Not a big deal. Literally one extra character.

I don't think you understand. At a minimum, you would need parsenumi, parsenuml, parsenumll, parsenumimax, parsenumu, parsenumul, parsenumull, parsenumumax, parsenumi8, parsenumi16, parsenumi32, parsenumi64, parsenumu8, parsenumu16, parsenumu32, parsenumu64, parsenumf, and parsenumd.

All code is liable to use an outdated syntax or features in 10 years. What if the stdlib does improve and adds functions that supplant these? What if they do so only partially? What if your functions could be improved to rely on a new stdlib function that's safer/faster/etc?

This is C, not perl. Features don't become "outdated" in a mere 10 years. And if there is a new standard library function which is useful here... well, (a) I wouldn't want to use it for at least 20 years, and (b) I'd probably be one of the people writing said standard library function, so I'd have no difficulty updating my macros.


>I don't think you understand. At a minimum, you would need parsenumi, parsenuml, parsenumll, parsenumimax, parsenumu, parsenumul, parsenumull, parsenumumax, parsenumi8, parsenumi16, parsenumi32, parsenumi64, parsenumu8, parsenumu16, parsenumu32, parsenumu64, parsenumf, and parsenumd.

Fair. I still don't really see the value in this, though. The only real gain is from doing the range check, and that's niche enough that I'd just write a function for the particular project that demands it, and I'd only have to write one function because there'd likely only be one integer type it's relevant to.

>This is C, not perl. Features don't become "outdated" in a mere 10 years. And if there is a new standard library function which is useful here... well, (a) I wouldn't want to use it for at least 20 years

Fair enough.

>I'd probably be one of the people writing said standard library function

I hope not!


that's niche enough

Not niche at all. You should have a range check on pretty much every value you accept from a command-line option.


cperciva, sorry, but you're the type of person who should be kept away from security-critical code.

It's not that I would doubt in any way your deep knowledge of the C language, quite the opposite; it's just your arrogance, your mentality of elitism and your belief that you don't belong to the people who make mistakes (like the "because of XYZ years of experience" bullshit).

IMHO if you want to write good code (especially safety-critical code) it is a FUCKING MUST to believe that you, or others working on your code, will make mistakes caused by badly readable code. The mental load caused by reading your code has to be minimized as much as possible.

But I guess the problem people like you have with writing clean and readable code is that you can't show off your knowledge and experience built over several years. This behavior reminds me of the philosophers (also social scientists) criticized by Karl Popper as "obscurantists", who prefer to describe simple issues in complex language to show how intellectual they are, even though those issues could be described in a clean and simple way so that everybody understands them.

I'm highly surprised that you're still behaving arrogantly, advocating complex macros and acting as if you have never made a mistake when dealing with code. Open your eyes and deal with the fact that YOU are one of the people who devalued the whole security promise of a piece of security-critical software [0]. YOU are one of the people, as we all are, who make mistakes. Be it by deliberately removing the increment operator of an IV or by a fault caused by a misunderstanding of unnecessarily complex code.

[0] http://www.daemonology.net/blog/2011-01-18-tarsnap-critical-...


Wouldn't it be possible to just expose parsenum_float, parsenum_signed and parsenum_unsigned instead of the macro interface?


#define IT "I think" // :)

>2 IT the switch/cases are quicker/clearer to grasp than the ones with BIGMACROS_X

>4 IT you shouldn't underestimate the existing neural paths of knowledge of the stdlib for C devs. Those std functions are read/parsed almost intuitively, unlike cute macros, except by their creator.

I'm all for "cleaning" C and having a non-completely-backwards-dependent / improved standard, but I'd much prefer strtof to a custom macro. And those standard C functions are often accelerated to death by compilers (see: memcpy, for example).


So others' argument is: with a macro, I don't have to read the code at all. Your argument is: macros are bad, so have explicit code everywhere, which of course makes readability a must.

For me, a clearly named macro beats the need to parse a few lines of code, even if they are very clear and readable.


It's worth keeping in mind that making stuff more maintainable for you does not necessarily make it more maintainable for others.

It's nice if your successor can just use your abstractions, but if he needs to modify them ...


True. That's why I'm hoping more people will start using these.


As a web developer who only rarely uses C for private things: I like that "elastic arrays" thing.


Feel free to use it. All of the libcperciva code is BSD licensed and should build and work on any C99-compliant system.


>Do not use macros. Do not use a typedef to hide a pointer or avoid writing “struct”. Avoid writing complex abstractions. Keep your build system simple and transparent. Don’t use stupid hacky crap just because it’s a cool way of solving the problem.

Heh, good luck avoiding the use of macros in sufficiently complex projects - sometimes C just can't use some control structures in an elegant manner without using macros or custom abstractions.


> Do not use macros.

Sir, yes, sir. I will throw away "offsetof" and "container_of" right now, just give me a moment.

> Do not use a typedef to hide a pointer or avoid writing “struct”

Somewhat agree with the former, but completely disagree with the latter. If there's a single thing that is not right with C, it's its excessive verbosity in places where none is needed.

Not typedef'ing your structs forces you to use an extra 7 characters per type mention for no clear benefit. To put it differently - if NOT having "struct" in front of a type name has any effect on readability/maintainability of your code, then there are deeper problems with your coding style that won't be solved by dragging "struct" around.


>Not typedef'ing your structs forces you to use an extra 7 characters per type mention for no clear benefit. To put it differently - if NOT having "struct" in front of a type name has any effect on readability/maintainability of your code, then there are deeper problems with your coding style that won't be solved by dragging "struct" around.

The benefit is to readability. You should treat structs differently from scalars, and the code should make the distinction apparent. You should not generally, for example, pass structs by value. This is just laziness.


I doubt getting confused between structs and scalars is a problem in practice. It surely never has been for me.

For me this is cargo cult maintainability: it has the sound of good advice, but doesn't seem grounded in reality.


Why not pass structs by value?

Of course, it is copied because "by value". On the AMD64 architecture your compiler will transfer your little structs in registers.


Not sure I follow.

> scalars

So you would typedef the scalars then? If you don't, then scalars will be the built-in types (which you'd presumably know well) and then all other type names will be typedef'ed structs/unions, still making it trivial to recognize them as such.


Yes, I think typedefing scalars is fine. Typedefs are useful for abstracting the underlying storage mechanism for a scalar (so you can i.e. change it on different architectures or in a future release without breakage), not for saving yourself 6 characters of typing.


Typedefs are useful for creating short-hand names for otherwise long or complicated type definitions. Saving 7 (6 for 'struct' + space) characters is as good use for typedef as any other.


You only have to write each set of seven characters once; yes, typing will require a bit more effort. However, we shouldn't optimize code for the ease of writing, we should be optimizing for the ease of reading. Write once, read many.

`struct foo` is a bit more instructive when understanding code than `foo`; at worst, they read the same to someone familiar with the codebase, at best they prevent the need to flip back and forth to the type definitions.


>Typedefs are useful for abstracting the underlying storage mechanism for a scalar (so you can i.e. change it on different architectures or in a future release without breakage), not for saving yourself 6 characters of typing.


You can just use stdint types. Or is it really better to typedef a CUTE_INT?


That's not what I'm talking about. CUTE_INT is stupid, but maybe mylib_error makes sense. It depends on context.


There are also enums, which are user defined scalar types (although I don't think this fact really affects your argument).


I have been writing C for more than a decade and have never been confused about whether or not a type was a struct or a scalar. This is a nonsense hypothetical.


Surely a simple naming convention such as a capital letter for compound types is sufficient?


Or something as basic as

    foo_s
    bar_u
    baz_e
for structs, unions and enums respectively. Perhaps even splurge on xyz_f for function pointers... though that's getting recklessly close to the Hungarian notation :)


I love Hungarian Notation, I mean, it's pretty intuitive that LPCTSTR is a long pointer to a (zero-terminated) constant TCHAR string.

I definitely never panicked as a kid when I saw WINAPI-related code.


Can you provide an example of control structures that cannot be used in elegant manner without macros? I honestly fail to think of any example.

However I generally agree that macros are pretty much unavoidable. Include guards are implemented using macro constants. It is pretty much the only portable way to force inlining (in some cases this might be necessary). Variadic macros are easier to create than variadic functions (e.g. a wrapper around fprintf for logging). And there is also the '_Generic' thing for people who can use C11.
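For instance, a minimal sketch of the fprintf-wrapping logging macro I mean (the macro name is made up):

    #include <stdio.h>

    /* Variadic macro wrapping fprintf: the format string and arguments are
     * forwarded as-is, and the call site's file/line are pasted in for free. */
    #define LOG(fmt, ...) \
        fprintf(stderr, "%s:%d: " fmt "\n", __FILE__, __LINE__, __VA_ARGS__)

    int main(void)
    {
        LOG("could not open %s (attempt %d)", "config.cfg", 3);
        return 0;
    }

One wart: strict C99 requires at least one argument after the format string here; GCC's ##__VA_ARGS__ extension papers over that, which of course loops right back into the extensions debate.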


Another unavoidable use of macros is to convert symbolic names into string literals or to enable warnings for printf wrappers in a portable code.

As for include guards one has to look really hard [1] to get a compiler that does not support #pragma once.

[1] - https://en.wikipedia.org/wiki/Pragma_once#Portability
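A sketch of both of those uses (the names are illustrative):

    /* Stringification: two levels so macro arguments get expanded before
     * the # operator turns them into a string literal. */
    #define STR_(x) #x
    #define STR(x)  STR_(x)

    #define MAX_CLIENTS 64
    static const char usage[] = "accepts at most " STR(MAX_CLIENTS) " clients";

    /* Portable printf-style warnings on a wrapper: expands to nothing on
     * compilers without the GNU attribute. */
    #if defined(__GNUC__)
    #define PRINTF_LIKE(fmt_idx, arg_idx) \
        __attribute__((format(printf, fmt_idx, arg_idx)))
    #else
    #define PRINTF_LIKE(fmt_idx, arg_idx)
    #endif

    void log_msg(const char *fmt, ...) PRINTF_LIKE(1, 2);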


I think these are just basic general principles, not unbreakable rules. Keep them in mind but break them when you really have to. Like the quote at the beginning says, avoid doing stupid things, but don't refrain from doing clever things when you have to.


For some things like setting up boards, typedef is sometimes required too; or at least that is what I was told in my class xD


Also, typedefs make things like arrays of function pointers much easier to read.
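A small sketch of what that buys you (the handler names are invented):

    /* Without a typedef: an array of eight pointers to functions
     * taking an int and returning void. */
    void (*handlers_raw[8])(int);

    /* With a typedef, the same declaration reads naturally. */
    typedef void (*handler_fn)(int);
    handler_fn handlers[8];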


It all depends on the level. array of pointers to func to array of structs X , yes. struct X, no.


1) Learn Compiler Design

2) Write a Compiler for a better language

3) In new language, write a compiler for your new language

4) Retire from C programming, occasionally come to Hacker News to reminisce about C programming and ways to avoid shooting yourself in the foot


Fortunately, Walter Bright did that. Now I can simply use his language to retire from C to D. :)


or

1) Learn Embedded Design

2) Stay in C for another few decades


I'm in embedded, but I have a fair measure of autonomy so I've got some Rust code on our device. If nothing else, the cross-compilation story for Rust is absolutely _beautiful_. Just stick the toolchain path in a config file somewhere.


Writing a compiler in C that can compile its own source code is, for me, the best thing to do before retiring from C programming.


Did 1, 2 and 4 around 20 years ago.


TL;DR: move on, nothing serious in this article. Just barking useless advice and spitting insulting nonsense.


Just like your worthless comment. The irony.


I posted it in the hope of preventing someone, somewhere, from wasting some of their time.


>>Avoid magic. Do not use macros

What a put off!!!

If you are programming in C in the 21st century then you better know what you are doing. And this whole advice is for dilettantes (no offense).

C is no longer a choice language to demonstrate high level programming principles (not that you can't do it but it's not for the lazy), there's a host of other languages that do that better. But if you are interested to reach close to the machine (eg: you program needs a direct view of memory) then C is 'the' choice even today.

Look at the kernel's list.h [0]; it's a beautiful piece of code, and look how concisely it uses macros. So the real advice to those starting out in C is to be bold and get immersed in all the things that people say you should not do, and then let simplicity emerge.
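For those who haven't read it, a stripped-down sketch of the style list.h uses (an illustration of the technique, not the kernel's actual code):

    #include <stddef.h>

    /* Intrusive doubly-linked list: the links live inside the user's struct,
     * and container_of recovers the enclosing struct from a node pointer. */
    struct list_head { struct list_head *next, *prev; };

    #define container_of(ptr, type, member) \
        ((type *)((char *)(ptr) - offsetof(type, member)))

    struct task {
        int id;
        struct list_head node;   /* embedded; no separate allocation needed */
    };

    #define list_for_each(pos, head) \
        for ((pos) = (head)->next; (pos) != (head); (pos) = (pos)->next)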

In other words, 'simplicity' of the novice and of the experienced share the same word but are two different concepts from two different points of view.

[0]https://github.com/torvalds/linux/blob/master/include/linux/...


See also tree.h from freebsd [1]. If you need a generic red-black tree, this is probably your best choice in C.

[1] https://github.com/freebsd/freebsd/blob/master/sys/sys/tree....


I'm coding in C89 because I'm writing software for industrial embedded controllers. Future maintainers of this software are more likely to be electronic engineers than programmers, so "keep it simple and don't use magic" is actually extremely good advice.

That said, my code does use a couple of macros, but only to do stuff that would be very long-winded and/or impossible without, and using malloc at all is very much frowned upon: storage should either be declared ahead of time in special .var files, or on the stack.


Yeah, in general I agree with the article, but the 'malloc all your memory' part is something I disagree with heavily. Outside of the fact that many malloc and calloc implementations don't particularly deal well with integer overflow[0], it introduces non-determinism into your code. MISRA C even forbids the use of dynamic memory allocation.

In fact, the environments where C shines oftentimes won't even have a dynamic allocator...

[0] Try allocating ((size_t)-1) bytes. Quite a few implementations will give you a pointer back! They'll add some space for a header, or round up the size to the nearest n-byte boundary. This problem's compounded by the fact people assume passing unsanitized input to malloc effectively sanitizes it.
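The caller-side version of the same problem is the size computation itself; a minimal sketch of the kind of check needed (essentially what calloc and OpenBSD's reallocarray do for you):

    #include <stdint.h>
    #include <stdlib.h>

    /* Multiplying an untrusted count by the element size can wrap around
     * before malloc ever sees the result, so check first. */
    void *alloc_array(size_t count, size_t size)
    {
        if (size != 0 && count > SIZE_MAX / size)
            return NULL;            /* would overflow */
        return malloc(count * size);
    }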


This should have been shortened to "Avoid magic". This is applicable to any language, not just to C.

Magic stuff is for wizards, and most of us are not.


"Magic" becomes obvious common-sense once you understand. The overall theme of the parent comment is that we should try to understand instead of giving up, because that's what makes us learn and become better at the language.

IMHO if you want to become an expert in C and C++, you must be able to read Asm and understand what the machine is actually doing.

A very relevant article on this same idea: http://www.linusakesson.net/programming/kernighans-lever/


Magic has nothing to do with understanding assembly. Magic is simply an undocumented piece of code (not even self-documenting) that you have to ask the author about in order to understand. There's no magic if you understand the code right away, or by looking up some reference.


> "Magic" becomes obvious common-sense once you understand.

I fully agree. Packing data into a binary array, popcount with bit fiddling, macros for generics, etc. may seem "magic" to some, but they can be useful, especially in performance-critical code. Once you understand this "magic", it is just another set of common pieces in your toolbox. C is a simple language. Learning C plus such "magic" still takes much less time than grasping the entire C++.
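One small example of the kind of bit fiddling I mean (the textbook trick, nothing exotic):

    /* Kernighan's popcount: x & (x - 1) clears the lowest set bit,
     * so the loop runs once per set bit rather than once per bit. */
    static int popcount32(unsigned int x)
    {
        int n = 0;
        while (x != 0) {
            x &= x - 1;
            n++;
        }
        return n;
    }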


That is both good and bad advice. Good because yes, programmers must quickly look at others' code and be exposed to the idioms and the anti-patterns. Bad because just by looking at something the newbies don't immediately connect it with one category or another; instruction is necessary to avoid them resorting to anti-patterns (as is ironically the case with many things Linux).


Could you give me a reason why to use macros instead of typed static inline functions?

If it's performance, where can I find something like a rationale documentation or empirical measurements of this list implementation?


I touch C only occasionally, so I'm not a guru. But I love macros. I want all languages to have macros.


> GNU is a blight on this Earth, do not let it infect your code.

Can someone explain this sentiment to me? I know about licensing and philosophical criticisms, but are there any _technical_ faults?


It's "All the world's a VAX" in its new form, where you depend on some language/library feature that's actually not in the standard (IIRC alloca and preprocessor extensions are common culprits).

And then suddenly you're on a different platform and discover that you can't rely on that -- pretty bad if it's in a central part of your system (like trampolining functions for your toy lisp).

On the other hand, you might run into the same issues if the standard support is sub par. People used to declare K&R-style functions for ages after ANSI was passed, and I wouldn't bet a central part of my system on C99.


So in your opinion the problem is "only" one of portability to non-gnu systems?

I thought the author also implied that GNU was technologically inferior and/or problematic. That would interest me...


Probably the ubiquitous "bloat" issue some die-hard C-heads have. But that's his prerogative, I was just pointing out that the specific context points towards portability issues.


There is quite a lot of GNU code written in C.

However, since GNU is a political umbrella project which does not necessarily value code quality, we see a lot of unnecessary complexity arising from the focus on creating working "free" software and not minding the quality much.

Anyone who ever touched GNU autotools will confirm that the bloat issue is real. No need to be a diehard C head.


Lots of GNU code flies in the face of these principles. A lot of GNU software encourages bad behaviors like using non-standard features (which makes for non-portable software). GNU software is also often very bloated and overengineered, and often found in critical places like glibc. Their coding styles are highly questionable and I disagree with a lot of their design decisions.


Thanks for the reply.

> GNU software is also often very bloated and overengineered, and often found in critical places like glibc.

My memory is rusty - aren't those features "protected" against accidental usage, with something like "#define _GNU_SOURCE" needed before you include the headers? Or is that protection insufficient?

> Their coding styles are highly questionable and I disagree with a lot of their design decisions.

On the coding styles I agree. On the design decisions... I realize that this is a big and hard question (and that answering it probably amounts to another blog post) but could you please explain that, maybe with an example?


>My memory is rusty - aren't those features "protected" against accidental usage, with something like "#define _GNU_SOURCE" needed before you include the headers? Or is that protection insufficient?

They are "protected", yes, but their mere presence encourages people to use them. There's no reason to use asprintf, but glibc makes it available so some software uses it. That software is now non-portable.

>On the coding styles I agree. On the design decisions... I realize that this is a big and hard question (and that answering it probably amounts to another blog post) but could you please explain that, maybe with an example?

Maybe in a blog post someday.


> They are "protected", yes, but their mere presence encourages people to use them. There's no reason to use asprintf, but glibc makes it available so some software uses it. That software is now non-portable.

I think asprintf is useful - it replaces an ugly "malloc-realloc-snprintf-loop"...

On exposing non-portable functions/features:

  - OpenBSD does it (pledge)

  - Freebsd/NetBSD do it (kqueue)

  - ... I'm certain other systems do too
I think _exposing_ non-portable features is ok, as long as you can't use them _accidentally_. Now, _if_ glibc fulfills that, the blame should fall on the (lazy) developer. If on the other hand glibc makes accidental use possible, then that is... bad for portability.

> Maybe in a blog post someday.

I would like to read that. :-)

[edit: formatting...]


>I think asprintf is useful - it replaces an ugly "malloc-realloc-snprintf-loop"...

Well, implement it yourself then. It's really not hard to live without, though. I don't know about this loop you're talking about but I just do this:

    int len = snprintf(NULL, 0, "fmt", ...);  /* returns the length needed */
    char *foo = malloc(len + 1);
    snprintf(foo, len + 1, "fmt", ...);       /* size argument includes the NUL */
Easy to wrap that up in your own asprintf function if you would find that useful. And in practice writing a lot of these functions is going to happen anyway when someone ports your non-portable code to another system.
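Such a wrapper might look something like this (a sketch using vsnprintf twice; the name is made up and error handling is minimal):

    #include <stdarg.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Portable asprintf-alike: measure with vsnprintf, then allocate and format. */
    static char *xasprintf(const char *fmt, ...)
    {
        va_list ap, ap2;
        int len;
        char *buf;

        va_start(ap, fmt);
        va_copy(ap2, ap);
        len = vsnprintf(NULL, 0, fmt, ap);
        va_end(ap);
        if (len < 0 || (buf = malloc((size_t)len + 1)) == NULL) {
            va_end(ap2);
            return NULL;
        }
        vsnprintf(buf, (size_t)len + 1, fmt, ap2);
        va_end(ap2);
        return buf;
    }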


I did not know that the return value of snprintf is actually the length of the string that would be produced. I assumed I'd just get some error code - which would require re-trying with some larger buffer, thus the "loop".

Thanks for that :-)


>Well, implement it yourself then.

Yeah, let's reinvent the wheel anytime something is not available on all platforms in the world, whether or not you intend to target them...

If I target GNU systems, why can't I use GNU features? If people really want to port my code, let them do so.


One of the most evil parts is /etc/nsswitch.conf that injects at runtime arbitrary shared libraries implementing various parts of GNU C library. That leads to proliferation of undocumented file formats and protocols that are hidden behind library interfaces.

For example, Linux still does not have a sane DNS resolver interface that is useful without the GNU C library. With the move to containers, that leads to many issues like the inability to resolve link-local names, etc.


> One of the most evil parts is /etc/nsswitch.conf that injects at runtime arbitrary shared libraries implementing various parts of GNU C library. That leads to proliferation of undocumented file formats and protocols that are hidden behind library interfaces.

Are you opposed to dynamically linked libraries in general? Or do you simply oppose using them for name resolution? And how would you propose implementing NIS, LDAP/Kerberos, etc. without them (and without recompiling the libc itself)?

> For example, Linux still does not have a sane DNS resolver interface that is useful without GNU-C library. With move to containers that lead to many issues like inability to resolve link-local names etc.

Valid criticism. But that's not the GNU peoples' fault, IMHO - I mean, they have a working libc, it's not their fault that containers don't work with other libcs. (Or do I misunderstand you?)


NIS, LDAP etc. should have used a unix socket interface to talk to system daemons using a well-defined API. This is what happens in practice in any case, but it is done over hidden protocols unusable outside glibc.

As for the resolver, consider that on Ubuntu/Fedora etc. there is a hack to set the nameserver in /etc/resolv.conf to 127.0.0.1, which runs a local caching resolver. However, this is broken for local name resolution with multiple interfaces, as DNS replies do not include the interface name. So in a better world glibc would simply talk to a system daemon using a well-defined protocol. Instead, right now, to expose, say, mdns into containers, one has to spend too much effort on various hacks even if one does use glibc in the container application.


I think the author meant specifically GNU extensions to various POSIX functions, which can be a pain when trying to run code using them in non-GNU environment.


Yes, that is a good point.

However I thought that "blight" implies some actual technological inferiority. If there is some, I would like to know more about that.


Is there anything popular which does not have a large amount of criticism directed towards it? Anything at all?


I disagree with quite a few of them. The title should be "Principles for C programming ON UNIX-BASED SYSTEMS". C programming is quite a bit wider than this, and some of the typical points here are no-nos if you want to write /portable/ C.

For example, "Do not use fixed size buffers". It's all very fine, but 1) it can be exploited as well if someone managed to fudge the size you are going to allocate, and 2) on some platform, you don't have/want malloc(). So it's a lot better to have a fixed buffer and check the sizes carefully before copying into it.

Another one I dislike (but it's personal preference) is the 'use struct for pointers to structs' -- well, nope, I don't like that, it's unnecessarily heavy. I typedef my structs all the time, and call them something_t, and * something_p. It's easier to rework, rename, search for and it's quicker to type so makes the source code lighter to read. I know it's not popular, and for example the kernel guidelines agree with you, but I don't.

As for "no circumstances should you ever use gcc extensions or glibc extensions" well sorry, I also disagree here. I love the 'case X..Y:' syntax for example and it's been around for about a million years. It's not because the C standards prefer adding idiotic syntax instead of useful ones like this that I'm going to stick along and limp when there is a perfectly nice, clear and very readable alternative.

Another one I love but can't use are the sub-functions. Now what also would have been a lovely extension if the runtime had been perfected a bit, but it was never 'finished'. Speak of easier code to read when your qsort() callback is listed /just above/ the call to qsort().

Another extension is of course the __builtins that you actually do need on modern systems. Like memory barriers, compare and swaps, ffs, popcount and so on. Of course I can have an explicit function to do it (in the case of the last 2), but that's the sort of things that ought to be in the C library anyway. So I'll use these, thanks.

As far as the rest of the article about the process, your code reviewers and so on, in many places and on many projects (open source ones are a case in point) you don't have the freedom/time to do that. The rule is ' do as best as you can' -- and that ought to do it in many cases.


I used to typedef my structs but I stopped doing that a while ago. Even if something is an opaque type it's still useful to know that it's a struct and not some random typedef for an integer or similar. You might not be able to copy it for instance, and if you can you might not want it if it turns out to be a few kilobytes in size and harms performance.

The linux kernel coding style in particular forbids such typedefs: https://www.kernel.org/doc/html/latest/process/coding-style....

I don't agree with everything in the kernel coding style but it's mostly reasonable and I think their approach to typedefs is perfectly reasonable.

And why do you use separate typedefs for pointer types? That's borderline obfuscation IMO, if I'm dealing with a pointer I want to know it. If it's in order to save a single keystroke it really isn't worth it IMO and I don't see how it helps reworking anything.

I agree with the fixed size buffer thing though. There are plenty of situations where fixed size buffers are completely fine, that's way too broad as a general "principle". I guess the idea is "make sure you don't reserve less memory than you need" but that's rather obvious, isn't it?


> 'I typedef my structs all the time, and call them something_t, and * something_p’

I wish people would stop perpetuating this particular naming convention. It's in violation of POSIX, which specifically reserves the entire *_t 'namespace'. Obviously it's fine if this is done on a system or environment for which this is irrelevant, but it's best avoided otherwise.


The reality is that if you are creating a library you probably should prefix your types and functions anyway. And rely on the prefix to minimize collision probability. So it doesn't really matter if you put _t and the end of your type aliases. You will probably not get the collisions anyway. Unless POSIX is going to suddenly introduce mylib_array_t or something.


No, but your compiler MIGHT decide in a future release that it's a whole lot faster to ignore the header files for standard types and definitions and just copy a pre-parsed version of the struct into the symbol table when the header is included. It might look at the _t and decide nope, I don't have a definition for this so it's an error, despite your own definitions.

This probably won't happen. But if it does you don't have any grounds for complaint really.


The compiler to do that would also need to drop C standard compatibility (section 7.1.3 of C99). Which is probably a good reason to complain and to just stop using that version of this purely theoretical compiler.


> For example, "Do not use fixed size buffers". It's all very fine, but 1) it can be exploited as well if someone managed to fudge the size you are going to allocate

Fuzzing really underscored just how terrible "dynamically sized buffers" are to me as well. Even if your logic is perfectly correct (e.g. no possibility of buffer overflows), something as simple as deserializing a length-prefixed array needs a quota or cap. It's not enough to handle malloc failing: Someone will successfully allocate 1.9GB via your 32-bit deserialization code, spreading the actual allocation failures across the rest of your codebase - including all 3rd party and system libraries - most of which almost certainly have at least one oversight in OOM condition error handling, invoking all kinds of potentially exploitable undefined behavior.
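A sketch of the kind of cap I mean (the limit here is an arbitrary placeholder; the point is that one exists at all):

    #include <stdint.h>
    #include <stdlib.h>

    #define MAX_ELEMS 65536   /* hypothetical per-message quota */

    /* Reading a length-prefixed array: reject absurd counts up front instead
     * of letting a multi-gigabyte allocation "succeed" and starve the rest
     * of the process of memory. */
    int32_t *alloc_prefixed_array(uint32_t count)
    {
        if (count == 0 || count > MAX_ELEMS)
            return NULL;
        return malloc(count * sizeof(int32_t));
    }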


I prefer attempting to find an O(1)-space algorithm over dynamic allocation, which I suppose means using fixed-size buffers anyway.

In my experience it is surprising how many programmers will --- regardless of language --- tend to settle for O(n)-space or higher algorithms when just a little more thought would produce a simpler O(1). Line numbering is a common example of this.


Could you explain your point about line numbering?


Take, for example, displaying the context source code for an error when you know the file and lineNo of the error.

A naive approach would be to read the entire file into an array and then to output the lines [lineNo-context..lineNo+context], possibly with line numbers prefixed. This is O(N) memory with regards to the size of the file being processed. A 8GB source file will crash your program when built for 32-bit.

Another approach is to read and discard until you've discarded lineNo-context '\n' characters, then copy chunks from your source directly to the output until you've read another 2*context+1 '\n' characters. This is O(1) memory (strictly speaking, you could do it byte-by-byte with an integer counter or two - practically you might have a fixed size buffer to read/write faster) and would allow you to handle even 1TB files sanely.

To be fair, I'm often guilty of the naive approach myself :)
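A rough sketch of the streaming version (character-at-a-time for clarity; a real implementation would read in chunks):

    #include <stdio.h>

    /* Print lines [line_no - ctx, line_no + ctx] of fp using O(1) memory. */
    static void print_context(FILE *fp, long line_no, long ctx)
    {
        long cur = 1;
        int c;

        /* Skip everything before the window. */
        while (cur < line_no - ctx && (c = getc(fp)) != EOF)
            if (c == '\n')
                cur++;

        /* Copy the window straight to the output. */
        while (cur <= line_no + ctx && (c = getc(fp)) != EOF) {
            putchar(c);
            if (c == '\n')
                cur++;
        }
    }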


>For example, "Do not use fixed size buffers". It's all very fine, but 1) it can be exploited as well if someone managed to fudge the size you are going to allocate, and 2) on some platform, you don't have/want malloc(). So it's a lot better to have a fixed buffer and check the sizes carefully before copying into it.

This is reasonable. I mentioned measuring fixed size buffers if you have to use them, but edited it out. I think all programming guidelines should be taken with a grain of salt and adjusted as necessary when sanity demands deviations from them.

>Another one I dislike (but it's personal preference) is the 'use struct for pointers to structs' -- well, nope, I don't like that, it's unnecessarily heavy. I typedef my structs all the time, and call them something_t, and * something_p. It's easier to rework, rename, search for and it's quicker to type so makes the source code lighter to read. I know it's not popular, and for example the kernel guidelines agree with you, but I don't.

It's easier to rework, rename, and search for? How so? The problem with it is that you should be able to easily differentiate structs and scalars, because you should treat them differently. Same for pointers. You should generally be passing structs by reference, not by value, for example. I don't appreciate hiding information about the nature of your types for the sake of ergonomics. The readability gain trumps the extra quarter-second of typing each time you use the type.

>As for "no circumstances should you ever use gcc extensions or glibc extensions" well sorry, I also disagree here. I love the 'case X..Y:' syntax for example and it's been around for about a million years. It's not because the C standards prefer adding idiotic syntax instead of useful ones like this that I'm going to stick along and limp when there is a perfectly nice, clear and very readable alternative.

gcc is not the only compiler in the world. You can't be crying out about the Unix-specific nature of this article and then favor a dependence on gcc.

>Another one I love but can't use are the sub-functions. Now what also would have been a lovely extension if the runtime had been perfected a bit, but it was never 'finished'. Speak of easier code to read when your qsort() callback is listed /just above/ the call to qsort().

Ugh. Just make a static function.

>Another extension is of course the __builtins that you actually do need on modern systems. Like memory barriers, compare and swaps, ffs, popcount and so on. Of course I can have an explicit function to do it (in the case of the last 2), but that's the sort of things that ought to be in the C library anyway. So I'll use these, thanks.

Why do you need these? If you must, see my comments on abstracting non-standard/non-portable/etc code.


It's quite common in Embedded programming to use pools of fixed size buffers. Malloc/free work on these pools.


I tend to find fixed size buffers easier to conceptualize than dynamically allocated buffers. Even with more modern languages like C#/Java, I still see developers use fixed size buffers, even within enterprise apps.

So I think there is something to be said about the whole movement of books/expert advice advocating against fixed size buffers/enums in favor of more dynamic memory models, when people in the trade are still using enums and static sizes. Anecdotally I've seen more use of them now than at any time in the past.


> even within enterprise apps.

The epitomes of today's software. I really like how you used "even".


Worked on a number of different projects, clearly from a maintenance perspective. From C/C++, mostly around VBS2/VBS3 (Operation Flashpoint) and VSS simulation systems. Moved on from the C/C++ simulation market when the money wasn't really in it unless you're the sales people or upper management. Kind of drives it home when you bike into work and all of management are driving BMWs etc. This was a startup where I took a major pay cut.

After that phase I moved into web development: Tomcat/Java and servlet containers, and C# and F#. I've spent about 12 to 15 years in total in software development in this field.

When I first started programming I learnt from the Quake/Doom engine from John Carmack. I can attest his software style and technical finesse was something to be admired, even to me today. There was something to be said for having a look at well-written C code that was really straightforward and easy to follow. At this time OOP/Java was starting to become mainstream in the marketplace, and most of the hype and push was from marketing and also compiler writers/contractors wanting to push their contracting sales pitch at universities/schools and management.

Granted, the whole OOP/Java thing was one giant experiment that paid off for other people but never really paid off for me. In my younger years I remember not getting OOP. When I say NOT getting it, I never really felt there was a clear explanation of what OOP was, and my gut feeling was that the whole theory vs practicality didn't work for me. So, seeing that I had time to waste, I started researching and reading as many OOP books as I could find. In total I've read about 60 different OOP books and research papers. After all this I still conceptually don't GET OOP. The whole thing is an act of cognitive dissonance for me.

You probably GOT OOP but I never did. So more power to you.

My only anecdotal (single data point) experience has been maintaining a large range of different types of software over the years: from procedural code, to functional programming, all the way to Java and Spring (inversion of control).

A lot of people smarter than me in the 90s/2000s created these massive taxonomy systems. Multiple inheritance anything from 10 to 15 layers deep. Excessive use of design patterns and over-use of meta-programming where meta-programming (really) didn't need to be used all the time (looking at you, C++). I remember nights tearing my hair out because these things took up the whole wall. Literally, the taxonomies were just that large, with so many inter-layered derived classes calling derived classes, which in turn called the base classes, which would in turn call the derived classes, which would then in turn call some event handler. Yes, you get the picture.

It did take about 3-4 years of 10-hour nights to finally get to the point with this massive inheritance system (Operation Flashpoint, btw) where you could be productive. Then, two months after that, I quit.

I then moved over to Java and web-app development, where I doubled my wage overnight. The Java projects' designs had also drunk the kool-aid during the 2000s, so yet again I was faced with a 7-to-8-layer (single) inheritance system, each consecutive developer building on it. As you can imagine, the cost/turnaround time and budget for such a system caused them to drop it. They settled on Spring and inversion of control (dependency injection).

Granted, most of the code I see teams writing now is a throwback to procedural code. You have your controller that processes requests, which in turn passes them off to services, which in turn access repositories. This is inclusive of your typical MVC model, but most of the services and code I see now are just singleton instances of classes that act as namespaces for functions.

Prior to this, in Java land, all enums and statically allocated arrays were bad and unclean. DIK_CODEs or id identifiers were discarded into the rubbish, replaced by developers using the inheritance type system in place of such crude (primitive) approaches to programming. Look at all those enums, let's replace them using the type system of the language!

So for example instead of writing.

#define CUSTOMER_PAGE 1

#define CUSTOMER_CHECK_OUT 2

#define ORDER_FORM 3

You had developers using the type system in its replacement.

class Validator implements IValidator

class Orders extends Validator

class CustomerPage extends Orders

class CustomerCheckOut extends Orders

class OrderForm extends CustomerPage

Somebody is going to come here and proclaim `they were doing it wrong`. I just shrug my shoulders and say it doesn't really matter I'm stuck looking at this mess.

For today, I see developers and new projects coming online where they've moved away from such inheritance models and moved to a more procedural approach and flat design. I do welcome this move, and cheer for the faster turnaround time.

It takes me, on average, when faced with new projects that use massive inheritance structures, 4-5 days to get my head around the system (if at all). The same or more complicated systems using old procedural statically defined arrays, enums and defines take about 1 hour to find and isolate the problem and fix it.

Other teams may have had success with large scale OOP code-bases, though I've mostly found them to be error prone, more riddled with edge case situations and bugs. They're a nightmare to extend (counter to the whole notion of the sole reason for OOP), compared to your typical "on this ID do this" approach. It says something that I've recently started doing development work on the ffmpeg and x264 code bases and it's a pleasure to be up to speed and doing some productive work within 2 to 3 hours.


One of the main reasons why I visit HN is real life experience & philosophy stories like that. Thanks!


One of the videos that can articulate the whole thing better than me is (https://www.youtube.com/watch?v=QM1iUe6IofM)

It was a long-winded written rant. I still use OOP, but it's more or less glorified namespaces with functions. Today I'd rather work on procedural code than OOP code. It may have bugs, it may have its own little quirks, but it's like an old rusted vehicle that still keeps going.


Feels too heavy on "don'ts." I imagine the motivation is the desire to be heard and understood. But I also need something simple, like "keep it simple, stupid" or "first have a working product that anyone could read". Positivity >> negativity.


Think of a circle with a dot at the center.

When you define things positively - "do this, do that" - you're saying "get to the center dot, and come from anywhere on the circle".

When you define things negatively - "don't do this, don't do that" - you're saying "get to anywhere on the circle, and come from the dot."

"Test your code". Your code could be library code, it could be core language features, it could be framework patterns, it could be executable utilities. Ends up all the same - tested code. From anywhere, to somewhere specific.

"Don't test controller actions". You can test requests, you can test at a feature level, you can test the models and functions used by the controllers. Ends up all kinds of different - no tested controllers. From somewhere specific, to anywhere.

Something telling you to do or not do something doesn't mean it's positive or negative; that comes from tone and presentation, not content.


I guess there are 'donts' because people need help with the keep it simple part, they are quite comfortable with the stupid part.


"nobody's grading your code by how many abstractions and topics from a textbook it employs."


That said, people sometimes are grading your code based on performance. Good algorithms often win over simplistic ones.


I'd say when you have big N, smart O(n) algorithms are better. But most of the time you have small N. Better to optimize just the really big N in your program, based on what you profiled, than to use complex and "optimized" algos when N<1000.

(I cannot retrieve the article this was taken from)



I've recently been reading Scott Meyers' Effective (Modern) C++ books. They are fantastic - can anyone recommend something similar for C? I.e. books that assume you're familiar with the language, but explains pitfalls and best practices in a practical way.


21st Century C is very very good.

http://shop.oreilly.com/product/0636920033677.do

Enjoyable to read through, covers tools as well as the language, and is a very good reference for good practice. It covers the basics and then goes into depth on a few of the things C is supposed to be bad at, demonstrating good ways to work (strings, threads, OOP, libraries).

I have a copy on my bookshelf at work, and have been referring to it extensively this week (exposing a C entry point to a C++ world...) It has steered me away from some fairly silly things a couple of times!

It does disagree quite strongly with the article though.


Deep C Secrets.


That book is great, very in-depth, yet fun to read.

One caveat, though, this book is from the mid-nineties, so some parts are a little dated. Still, very much worth reading.


One thing that comes to mind is that young people today are weaned on oo languages and may find it hard to adopt the good old-fashioned non-oo style typical for C.

For example, intrusive data structures may seem like an anti-pattern to oo programmers, while they feel perfectly ok for a C programmer. On the contrary, C programmer may find non-intrusive data structures to be an anti-pattern as they require extra memory allocations.


The problem with all this advice is that C lacks so many amenities of modern programming languages (no module system, no opaque types, no automatic memory management or RAII, no closures, no parametric polymorphism, no subtyping) that working around these limitations is bound to violate some of it.

The same strict rules that you can afford to adhere to in other languages do not work for C: with C, the often delicate balance of tradeoffs can tip either way.

> Do not use a typedef to hide a pointer or avoid writing “struct”.

This really seems to fly in the face of the idea of information hiding, at least when listed as an absolute. The client of a module does not need to know whether (say) a handle is an int or a pointer.

> Do not use macros.

> Never put code into a header. Never use the inline keyword.

This just doesn't strike me as good advice. You're creating potentially costly abstractions. For example:

  for (foo_init(&foo); !foo_done(&foo); foo_next(&foo)) {
    ...
  }
  
Without macros or static inline functions, you're creating overhead. Overhead that other languages with proper module systems can avoid, but not C, so you have to work around it. Sometimes it's not an issue, sometimes this is code that you want to use in a hot loop. By not having low-overhead abstractions, you're encouraging manual inlining or hacking around their lack when writing performance-critical code (and if you aren't writing performance-critical code, why are you using C?).
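For concreteness, a sketch of the static inline version of that iterator (names and layout invented for illustration):

  #include <stddef.h>

  /* In a header: the compiler can inline these into the hot loop, so the
   * abstraction costs nothing compared to open-coding the iteration. */
  struct foo { const int *cur, *end; };

  static inline void foo_init(struct foo *f, const int *a, size_t n)
  { f->cur = a; f->end = a + n; }

  static inline int foo_done(const struct foo *f)
  { return f->cur == f->end; }

  static inline void foo_next(struct foo *f)
  { f->cur++; }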

Macros are also the only alternative that you have in C to compensate for the lack of closures. They aren't closures, and they aren't even good macros, but cover some of the same use cases.

> Do not use fixed size buffers - always calculate how much space you’ll need and allocate it.

This is not exactly wrong, but it only talks about half of the problem. The alternative to fixed size buffers are manual memory management or alloca(). Manual memory management is also error-prone and alloca() can blow up the stack. It's not that people necessarily think that fixed size buffers are good: they're choosing between various pain points. Fixed size buffers with a generous upper limit that is properly enforced can make for a perfectly viable trade-off.

> Keep your build system simple and transparent.

This is a bit vague. What's "simple and transparent"? At some point, you'll have to deal with things like how to figure out dependencies, for example. Preprocessors and code generators are common in C projects, often to reduce common C pain points.


These points may be especially important for C programming, but most of them really apply to programming in any language.


I mostly agree, however I don't like the idea of making code easy for novices to understand as a primary goal. That's because it may imply not using some powerful constructs out of fear they may not be well understood by beginners. I think it is better to imagine that the next person reading your code will be better than you and that he is going to judge you. So don't hold back, use your clever tricks, but make sure you use them correctly, otherwise you will look like a fool.

Also, never say never. All non-deprecated features of the language can be used : goto, inline, macros, you name it. You just need to know how to use them wisely.


> Use only standard features. Do not assume the platform is Linux

I'll decide what platform I am targeting, thank you. I don't feel any obligation to support obscure OSes if I am just targeting linux. I might as well use useful linux-specific and GNU userland features, they are helpful.


"Do not assume" doesn't really mean or imply "do not target", so stop being defensive. It's good advice.


Targeting expresses expectations about an environment. "Do not assume" would be contradictory to those explicit expectations.


And since we can follow that thought process, we are smart enough to infer the meaning of the advice given ("do not assume") in the proper context, can we not?

It's obvious to me we both can, so why the hell are you wasting our time with useless pedantry?


Why are you inferring meaning when the article's author used two paragraphs and numerous examples to illustrate his point unambiguously?

The article is clearly heavy on the absolutes (e.g. "Under no circumstances should you ever use"), and the grandparent comment criticized this and provided a good counterexample.

But you, ironically, chose to assume what the author meant by "do not assume".


OK, now I know you're trolling.


It all sounds good in theory, but in practice, in a large enough project, many of the suggestions are not practical.

Recently I was looking at rsyslog's source code.

Look at this "simple" file for example, https://github.com/rsyslog/rsyslog/blob/master/plugins/omstd...

It's an output module for rsyslog that logs to stdout.

I wanted to gouge my eyes out.


This code fails to follow many of my suggestions.

- Liberal use of macros

- Horribly unreadable coding style

- Needless use of compiler extensions

- Very poorly organized code

This is awful code because the authors are morons, not because the language is bad. They could stand to read this blog post.


Reading this thread, I see people making arguments and presenting working production solutions, while you often use words like 'terrible', 'horrible', and 'very' to judge them. Please don't take this as a personal attack, but I think these words are purely emotional and should be kept out of the discussion. This is actually a rule for scientific papers in my country: you throw out all non-technical adjectives and filler words, and if the text retains its sense, it is approved. If sentences then read as unfinished, they are removed too, because they carried no technical meaning.

We all love simple things, but many of the presented techniques evolved over decades of real, non-hello-world development, and not using them leads back to programming errors and bloat. These people are not stupid; this isn't 'look Mom, no hands', it's professional experience.


I did find this line of code quite entertaining:

     if((r = write(1, toWrite, len)) != (int) len) { /* 1 is stdout! */
If you're going to go to the effort to add the comment, why not just use stdout instead of 1?


Indeed, it's just a naive example, but wouldn't something like

    ssize_t written = write(STDOUT_FILENO, toWrite, len);
    if (written != (ssize_t)len) {
        ...
be simpler?


Because stdout is a FILE *. What he should have used is STDOUT_FILENO.


I'd say that's a bit naive.

How would you go about writing something like rsyslog without doing the things they have done?

How would you create rsyslog's flexible (I'm not saying it's "good") module system where people can write their own pluggable custom input/output modules?

Your comment somehow suggests that it's possible to achieve everything they have achieved in terms of functionality while doing it in a much much better way that they just don't know about.

If the people writing rsyslog aren't good enough to do it, then 99% of other people aren't either.

So let's not blame people, it's the weakness of the language.


Are you serious? You could start by fixing that godawful code style and removing the macros. You could change these variable names and function names to make more sense (CHKiRet? omsdRegCFSLineHdlr? What the hell?). The compiler extensions are unnecessary and can just be removed. None of these changes have any impact on the functionality of rsyslog.


Maybe you are not too familiar with rsyslog. Those are modules that get compiled into rsyslog itself.

If you attempted to create such a module system, you would very soon see that doing a lot of what they have done is unavoidable (weird naming and so on aside).

I see this attitude where people suggest that writing beautiful, secure, wonderful C code is possible, yet somehow everyone is just too dumb to do it.

Every example you show them they say "nah, that person? also too dumb".

You can find similar crap like that example that I pasted in all popular C projects including the Linux kernel and Redis.

At some point does something become not fit for purpose?

I'd say we'd make more progress if we become less tolerant of technologies that disrespect us as humans.

If only an elite subset of people is good enough to program in C, I'd say that's because C is not good enough, not because everyone else is too bad for C.


I'm not sure which parts of that you are referring to as unavoidable since C can do modules just fine without any mess. Define the functions you want to implement as static functions and add something like this at the bottom of the file:

    outputmodule_t omstdout = {
        .name = "omstdout",
        .description = "blah",
        .init = prepare_stdout,
        .do_action = write_to_stdout,
    };
    #if COMPILING_FOR_RUNTIME_LOADING
    outputmodule_t *rsyslog_outputmodule = &omstdout;
    #endif
Then either stick `omstdout` in a list of modules known at compile-time or compile separately and use dlopen/dlsym to get the one necessary symbol. No GNUisms, no language extensions, no linker tricks, no macro magic.
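
The loader side is just as small; a sketch, assuming the outputmodule_t struct above and POSIX dlopen/dlsym (link with -ldl):

    #include <dlfcn.h>
    #include <stdio.h>

    /* Sketch: load one module and fetch its single well-known symbol. */
    outputmodule_t *load_output_module(const char *path) {
        void *handle = dlopen(path, RTLD_NOW);
        if (!handle) {
            fprintf(stderr, "dlopen: %s\n", dlerror());
            return NULL;
        }
        outputmodule_t **mod = dlsym(handle, "rsyslog_outputmodule");
        if (!mod || !*mod) {
            fprintf(stderr, "dlsym: %s\n", dlerror());
            dlclose(handle);
            return NULL;
        }
        return *mod;
    }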

C has problems but it's not that hard to create readable code in it.


Have I said anything that implies the module system is bad here? Neither the guidelines in my article nor my comments here say anything about their module system. You could easily make this style better, name things better, and expand the macros, and it would (1) work exactly the same and (2) be much better code for it.


Cool down a bit, would you?

You could get your point across without using that tone.


> This is awful code because the authors are morons, not because the language is bad.

Rainer Gerhards is a researcher who has published in peer-reviewed journals [1].

Can we please end this "people who write code in a style I don't prefer are idiots" meme?

[1]: https://www.researchgate.net/profile/Rainer_Gerhards


Being a good researcher and being a good coder are definitely not always the same thing (not specifically talking about Mr Gerhards, but this code is UURGH; Redis also has modules).

Most of the research code I've seen is definitely not clean code, and it needn't be. (Version control? What's that?)


If you don't think this code is awful then we can't have a meaningful conversation about it. Smart people can write garbage code too; this is great evidence of it.


Do not use magic. — Strongly agree.

Do not use macros. — Absolutely I agree: I involuntarily grimace whenever I look at and/or use things like Boost MPL, or the wartier corners of the Python C-API’s underbelly, etc. I only use macros as straight-up batched ⌘-C-⌘-V:

    #define DECLARE_IT(type, name, value) const type name = value
    DECLARE_IT(int, zero, 0);
    DECLARE_IT(int, one, 1);
    DECLARE_IT(float, fzero, 0.0f); // etc
    #undef DECLARE_IT
Do not use typedefs to hide pointers […] — I cannot stand it when people do this. That asterisk is as syntactically valuable to you, the programmer, as it is essential to your program’s function. If the standard library can slap asterisks on file and directory handles then so can you (and by “you” I specifically include whoever wrote the `gzip` API, among other things).

[…] or to avoid writing “struct” — Huh, actually I feel the opposite: I think all those “struct” identifiers are clutterific, much like excessive “typename” ids in C++ template declarations. But aside from the points where I totally disagree with the author, I absolutely feel the same way 100%.


Interesting points. The maintainability part is freaking important, but I guess you only truly realize it once you've been bitten by it several times. There are some other details I don't necessarily agree with, like buffer sizes if you program for embedded devices or have real-time constraints.


Made me think about the line in Full Metal Jacket: "C is my programming language. There are many like it but this one is mine." https://www.youtube.com/watch?v=nkGIxGdZoYY


Thanks for the read. Very educational.

I would suggest increasing the font-weight of your bolded text. I could not differentiate it from the regular text when skimming the article; only when I read very carefully did I notice that you had some things bolded.


I'll wade in a little.

> Don't use macros.

> Don't use the inline keyword

> Never put code into a header

I disagree with these. It's often critical to avoid a function call in hot paths, and if you don't use macros or inline you have to resort to copy/paste, which is error-prone and hampers maintainability.

It's also the case that C is a rather inflexible language. Macros can be extraordinarily helpful in reducing boilerplate: in unit testing, for example. I won't argue that macro definitions are the easiest things to read (usually they're OK but they can get pretty arcane), but I do think something that cuts a file down from 8,000 lines to 1,000 is at least worth considering.

Macros can also help you maintain type safety. Colin Percival demonstrated this a little with his elasticarray, but khash is another example of using macros to dynamically create type-safe data structures. It can also help the compiler optimize your code.
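
A stripped-down sketch of that khash-style approach (hypothetical names): one macro instantiation stamps out a typed struct and typed functions, so the compiler sees concrete types instead of void* and can both check and inline them.

    #include <stdlib.h>

    /* Hypothetical sketch: DEFINE_VEC(intvec, int) generates struct intvec
     * plus a type-checked intvec_push(). */
    #define DEFINE_VEC(name, type) \
        struct name { type *data; size_t len, cap; }; \
        static int name##_push(struct name *v, type x) { \
            if (v->len == v->cap) { \
                size_t ncap = v->cap ? v->cap * 2 : 8; \
                type *p = realloc(v->data, ncap * sizeof *p); \
                if (!p) return -1; \
                v->data = p; \
                v->cap = ncap; \
            } \
            v->data[v->len++] = x; \
            return 0; \
        }

    DEFINE_VEC(intvec, int)
    DEFINE_VEC(strvec, const char *)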

> Don't use fixed-size buffers

Everything is a fixed-size buffer until you change its size. If you malloc a buffer of 1024 bytes and read 1025 bytes of user input into it, you overflowed anyway. The principle ought to be "check your bounds", which applies whether your buffer is on the stack or the heap.
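
i.e., something like this sketch (hypothetical helper), where the check is identical whether buf lives on the stack or came from malloc:

    #include <stddef.h>
    #include <string.h>

    /* Hypothetical sketch: copy untrusted input into a caller-provided buffer.
     * Returns 0 on success, -1 if the input would overflow it. */
    int copy_input(char *buf, size_t bufsize, const char *input, size_t inlen) {
        if (inlen >= bufsize)          /* leave room for the terminator */
            return -1;
        memcpy(buf, input, inlen);
        buf[inlen] = '\0';
        return 0;
    }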

> Do not use a typedef to hide a pointer or avoid writing “struct”.

I'm with you on pointer hiding (looking at you FreeType), but "struct" is just far too verbose. You don't accidentally pass things by value because the compiler will tell you you're passing the wrong type. You won't think it's actually a scalar because you have a grand total of 3 scalars in C (bool/int/float), and if you don't know the types you're working with in your functions you should probably look them up.

You probably don't think this is a big issue because you don't adhere to 80 columns in your code (I looked @ your GitHub briefly), but let me tell you you run out of space real quick, and "struct" is practically meaningless.

I found a good example:

  static void set_background(struct wl_client *client, struct wl_resource *resource,
  		struct wl_resource *_output, struct wl_resource *surface) {
vs.

  static void set_background(wl_client *client, wl_resource *resource,
                                                wl_resource *_output,
                                                wl_resource *surface) {
These are obviously not ints, it's a lot easier to read, and it fits in terminals. "struct" is honestly just noise.
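
And for what it's worth, you can drop "struct" without hiding the pointer at all; a sketch of the kind of typedefs that make the second form possible:

    /* The typedef removes the "struct" noise; the asterisk stays visible. */
    typedef struct wl_client wl_client;
    typedef struct wl_resource wl_resource;

    static void set_background(wl_client *client, wl_resource *resource,
                               wl_resource *_output, wl_resource *surface);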

> GNU is a blight on this Earth

Come on now. I hate GNU indentation as much as any reasonable person, but I don't think we need to go this far.

---

Unrelated: aerc looks great! I've been thinking about moving off gmail and moving more of my life back into the terminal (I used to be all mutt and IRC and now I'm gmail and hangouts :/ ), and I really like aerc's well-organized code. Nice work.


>I disagree with these. It's often critical to avoid a function call in hot paths, and if you don't use macros or inline you have to resort to copy/paste, which is error-prone and hampers maintainability.

This advice is to be taken with a grain of salt, as with all programming advice. If you have a hot path, you should do whatever is necessary to meet the requirements, including macros or inline functions. The caveat, though, is that performance-critical code that demands that is rare.

>It's also the case that C is a rather inflexible language. Macros can be extraordinarily helpful in reducing boilerplate: in unit testing, for example. I won't argue that macro definitions are the easiest things to read (usually they're OK but they can get pretty arcane), but I do think something that cuts a file down from 8,000 lines to 1,000 is at least worth considering.

I'm dissatisfied with all unit test frameworks for C that I've encountered. I briefly gave an example of how it might be done better in an unrelated blog post, would like to hear your thoughts: https://drewdevault.com/2016/07/19/Using-Wl-wrap-for-mocking...

>Macros can also help you maintain type safety. Colin Percival demonstrated this a little with his elasticarray, but khash is another example of using macros to dynamically create type-safe data structures. It can also help the compiler optimize your code.

I mentioned in response to cperciva's post that this is a tricky one. I acknowledge both sides of this discussion as valid but fall on the "just use void*" side. Not sure how it helps the optimizer out, though.

>Everything is a fixed-size buffer until you change its size. If you malloc a buffer of 1024 bytes and read 1025 bytes of user input into it, you overflowed anyway. The principle ought to be "check your bounds", which applies whether your buffer is on the stack or the heap.

Well, I didn't say to just use fixed size buffers on the heap. I said to measure what you need and allocate that much. I probably should have phrased this more about just checking bounds in general, though.

>I'm with you on pointer hiding (looking at you FreeType), but "struct" is just far too verbose. You don't accidentally pass things by value because the compiler will tell you you're passing the wrong type. You won't think it's actually a scalar because you have a grand total of 3 scalars in C (bool/int/float), and if you don't know the types you're working with in your functions you should probably look them up.

Addressed in other comments.

>You probably don't think this is a big issue because you don't adhere to 80 columns in your code (I looked @ your GitHub briefly), but let me tell you you run out of space real quick, and "struct" is practically meaningless.

I actually do, but I use 4-wide tabs, and as in all things I permit the occasional exception to the rule.

    static void set_background(struct wl_client *client,
        struct wl_resource *resource, struct wl_resource *_output,
        struct wl_resource *surface) {
I don't mind adding the extra newlines. It's not a big deal. These standards also evolve over time, and I've become more strict (check out chopsui for a more recent C example). I'm also lenient on columns in pull requests. I actually code on a VT220 sometimes, so I do value width :)

No comment regarding GNU.

>Unrelated: aerc looks great! I've been thinking about moving off gmail and moving more of my life back into the terminal (I used to be all mutt and IRC and now I'm gmail and hangouts :/ ), and I really like aerc's well-organized code. Nice work.

Glad you like it! It's not ready for prime time, but maybe you'd be interested in contributing? I rely heavily on contributors to get so many projects done.


I'm not a mocking expert, but of all the solutions I've seen -Wl,--wrap=[blah] feels like the most straightforward, yeah. I also like how it's in your build system and you don't write a bunch of twisted code or auto-gen'd headers and redirects. It's appealing enough to make me wonder if there's an obvious reason people don't use it.
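
For anyone who hasn't seen it, the mechanism is tiny. A sketch (assuming GNU ld's --wrap and a test that exercises code calling read()): with -Wl,--wrap=read, references to read resolve to __wrap_read, and __real_read still reaches the original.

    /* test_mocks.c -- link the test binary with: -Wl,--wrap=read */
    #include <string.h>
    #include <sys/types.h>

    ssize_t __real_read(int fd, void *buf, size_t count);  /* the genuine read() */

    ssize_t __wrap_read(int fd, void *buf, size_t count) {
        (void)fd;
        const char canned[] = "mock input";
        size_t n = sizeof canned - 1;
        if (n > count)
            n = count;
        memcpy(buf, canned, n);
        return (ssize_t)n;
    }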

I'll put aerc on my list of projects to watch. I'm swamped with overly-ambitious hobby projects but sometimes I need a breath of fresh air :)


This video from a talk by Eskil Steenberg titled 'How I Program in C' is, IMHO, pretty good as well: https://www.youtube.com/watch?v=443UNeGrFoM


Great post, but most of it is relevant for all other languages as well.


These principles, at least by name, apply in general, not just to C.


Meh, skip on this one.


1. The only reasons to use C these days _are_ performance and portability. Write the code to your level of expertise, or use a different language to cater to the 'du jour' audience.

2. This was advice straight out of van der Linden's Deep C Secrets back in the mid-90s. It wasn't always true then, and it isn't always true now. I often use macros and don't find them a problem to understand in other people's code.

3. Give me a break. Except when you don't need to allocate and manage memory, which is almost every time you write a tool that does one thing well. This is not a good general principle.

4. This is the way it works for me. Do it. Nope.

5. Ok, some of this is common sense for writing portable code. 'GNU is a blight on this earth': "..now we see the violence inherent in the system.."

6. Yes, finally a good point, but not about C per se.

7. Ok, but this is not about C per se.

8. Hrrmmm. A culture of blame is what produces restrictive rules like these.

This article is fodder for the Rust and other anti-C devotees here to glom onto and point out all the problems with the author and his language as the ultimate straw man. To me it sounds like this person wasn't writing C in the 90s.


In case drewdevault.com has been blacklisted by your web proxy (as it is for mine), here's an alternate source:

https://sircmpwn.github.io/2017/03/15/How-I-learned-to-stop-...



