Brent's Encapsulated C Programming Rules (2020) (retroscience.net)
68 points by p2detar | 34 comments




Check against FLT_EPSILON. Oh boy.

The reason is floating point precision errors, sure, but that check is not going to solve the problems.

Took a difference of two numbers with large exponents, where the result should be algebraically zero but isn't quite numerically? Then this check fails to catch it. Took another difference of two numbers with very small exponents, where the result is not actually algebraically zero? This check says it's zero.
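
To make both failure modes concrete, here's a small sketch (the relative check is illustrative only, not a universal fix; see the ULP discussion below):

    #include <float.h>
    #include <math.h>
    #include <stdio.h>

    /* A relative check, scaled to the operands' magnitude. */
    static int nearly_equal(float a, float b)
    {
        float scale = fmaxf(fabsf(a), fabsf(b));
        return fabsf(a - b) <= FLT_EPSILON * scale;
    }

    int main(void)
    {
        /* Large exponents: x and y are adjacent floats (1 ULP apart,
           which is 8.0f at this magnitude), so the naive check calls
           them wildly different. */
        float x = 1e8f;
        float y = nextafterf(x, 2e8f);
        printf("large: naive=%d relative=%d\n",
               fabsf(x - y) < FLT_EPSILON, nearly_equal(x, y));

        /* Small exponents: p and q differ by a factor of two, yet their
           absolute difference is below FLT_EPSILON, so the naive check
           calls them equal. */
        float p = 1e-9f, q = 2e-9f;
        printf("small: naive=%d relative=%d\n",
               fabsf(p - q) < FLT_EPSILON, nearly_equal(p, q));
        return 0;
    }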


Yeah, at the least you'll need an understanding of ULPs[0] before you can write code that's safe in this way. And understanding ULPs means understanding that no single constant is going to be applicable across the FLT or DBL range.

[0] https://en.wikipedia.org/wiki/Unit_in_the_last_place


Other resources I like:

- Eskil Steenberg’s “How I program C” (https://youtu.be/443UNeGrFoM). Long and definitely a bit controversial in parts, but I find myself agreeing with most of it.

- CoreFoundation’s create rule (https://stackoverflow.com/questions/5718415/corefoundation-o...). I’m definitely biased but I strongly prefer this to OP’s “you declare it you free it” rule.
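
For anyone unfamiliar: the Create Rule is a naming convention where any function with "Create" or "Copy" in its name returns an object the caller owns and must release. A minimal sketch of the same convention in plain C (names here are hypothetical):

    /* Ownership is encoded in the name: *_create/*_copy results must
       eventually be passed to *_destroy; everything else is borrowed. */
    struct buffer *buffer_create(size_t capacity);       /* caller owns */
    struct buffer *buffer_copy(const struct buffer *b);  /* caller owns */
    const char    *buffer_data(const struct buffer *b);  /* borrowed */
    void           buffer_destroy(struct buffer *b);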


Thanks for the shout out. I had no idea my 2h video, recorded without a camera 8 years ago, would have such legs! I should make a new one and include why zero initialization is bad.

Thank you for recording it! :) It hits the right balance between opinionated choices with explanations and a general introduction to the "post-beginner" problems that a lot of people who have programming experience, but not in C, probably face.

I can't edit my comment any longer, but I really like nullprogram.com

Same here! That’s a great blog with a lot of good advice.

void* is basically used for ad-hoc polymorphism in C, and it is a vital part of C programming.

    void new_thread(void (*run)(void*), void* context);
^- This lets us pass arbitrary starting data to a new thread.

I don't know whether this counts as "very few use cases".
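
For illustration, a typical call site for that signature (the job struct and names here are made up):

    #include <stdio.h>
    #include <stdlib.h>

    void new_thread(void (*run)(void*), void* context);  /* as above */

    struct download_job { const char *url; int retries; };

    /* The entry point casts the context back to its concrete type. */
    static void run_download(void *context) {
        struct download_job *job = context;
        printf("fetching %s (retries=%d)\n", job->url, job->retries);
        free(job);  /* the thread owns the context once it starts */
    }

    void start_download(void) {
        struct download_job *job = malloc(sizeof *job);
        if (!job) return;
        job->url = "https://example.com";
        job->retries = 3;
        new_thread(run_download, job);  /* void* carries the typed payload */
    }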

The Memory Ownership advice is maybe good, but why are you allocating in the copy routine if the caller is responsible for freeing it, anyway? This dependency on the global allocator creates an unnecessarily inflexible program design. I also don't get how the caller is supposed to know how to free the memory. What if the data structure is more complex, such as a binary tree?

It's preferable to have the caller allocate the memory.

    void insert(BinTree *tree, int key, BinTreeNode *node);
^- this is preferable to the variant where it takes the value as the third parameter. Of course, an intrusive variant is probably the best.
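
For concreteness, a rough sketch of the intrusive variant (hypothetical layout): the links live inside the caller's node, so insert never allocates:

    /* The caller embeds BinTreeNode in its own struct and owns all
       memory; insert() only wires up pointers. */
    #include <stddef.h>

    typedef struct BinTreeNode {
        struct BinTreeNode *left, *right;
        int key;
    } BinTreeNode;

    typedef struct BinTree { BinTreeNode *root; } BinTree;

    void insert(BinTree *tree, int key, BinTreeNode *node) {
        BinTreeNode **link = &tree->root;
        while (*link)
            link = (key < (*link)->key) ? &(*link)->left : &(*link)->right;
        node->left = node->right = NULL;
        node->key = key;
        *link = node;
    }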

If you need to allocate internally, then allow the user to pass in an allocator pointer (I guessed at the function pointer syntax):

    struct allocator {
        void* (*new)(size_t size, size_t alignment);
        void (*free)(void* p, size_t size);
        void* context;
    };

void* is a problem because the caller and callee need to coordinate across the encapsulation boundary, thus breaking it. (Internally it would be fine to use - the author could carefully check that qsort casts to the right type inside the .c file)

> What if the data structure is more complex, such as a binary tree?

I think that's what the author was going with by exposing opaque structs with _new() and _free() methods.
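
That is, the usual opaque-pointer pattern, roughly (guessing at names):

    /* bintree.h -- layout hidden; only bintree.c knows the fields */
    typedef struct BinTree BinTree;

    BinTree *bintree_new(void);
    void     bintree_insert(BinTree *t, int key);
    void     bintree_free(BinTree *t);  /* frees every node, then the tree */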

But yeah, his good and bad versions of strclone look more or less the same to me.


Curious about the allocator, why pass a size when freeing?

If you don't pass the size, the allocation subsystem has to track the size somehow, typically by either storing the size in a header or partitioning space into fixed-size buckets and doing address arithmetic. This makes the runtime more complex, and often requires more runtime storage space.

If your API instead accepts a size parameter, you can ignore it and still use these approaches, but it also opens up other possibilities that require less complexity and runtime space by relying on the client to provide this information.
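
As a concrete sketch of that second option: a bump allocator stores nothing per allocation, and a sized free can still roll back the most recent one (minimal, no alignment handling):

    #include <stddef.h>

    struct arena { unsigned char *base; size_t used, cap; };

    /* No per-allocation header: the arena keeps only a cursor. */
    static void *arena_alloc(struct arena *a, size_t size) {
        if (a->cap - a->used < size) return NULL;
        void *p = a->base + a->used;
        a->used += size;
        return p;
    }

    /* The caller-supplied size is what lets us reclaim the last
       allocation without ever having recorded it ourselves. */
    static void arena_free(struct arena *a, void *p, size_t size) {
        if ((unsigned char *)p + size == a->base + a->used)
            a->used -= size;  /* otherwise a no-op until arena reset */
    }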


The way I've implemented it now is indeed to track the size in a small header above the allocation, but this is only present in debug mode. I only deal with simple allocators like a linear, pool, and normal heap allocator. I haven't found the need for something super complex yet.
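
That debug-mode header might look something like this (hypothetical sketch, ignoring alignment):

    #include <assert.h>
    #include <stdlib.h>

    struct alloc_header { size_t size; };

    /* Debug builds prepend a small header recording the size... */
    static void *debug_alloc(size_t size) {
        struct alloc_header *h = malloc(sizeof *h + size);
        if (!h) return NULL;
        h->size = size;
        return h + 1;              /* user pointer starts past the header */
    }

    /* ...so free() can validate the size the caller hands back. */
    static void debug_free(void *p, size_t size) {
        struct alloc_header *h = (struct alloc_header *)p - 1;
        assert(h->size == size);   /* catch callers passing a wrong size */
        free(h);
    }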

"...C is my favorite language and I love the freedom and exploration it allows me. I also love that it is so close to Assembly and I love writing assembly for much of the same reasons!"

I wonder what the author's view is on users' reasons for choosing a C API?

What I mean is that users may want exactly the same freedom and immediacy of C that the author embraces. However, this very approach to encapsulation, hiding the memory layout and mediating access through accessor functions, limits the user's freedom and robs them of performance too.

In my view, the choice of C for a project comes with certain responsibilities and expectations on the user's side. Thus a higher degree of trust in the API user is due.


Good stuff.

Only things I disagree with:

- The out-parameter of strclone. How annoying! I don't think this adds information. Just return a pointer, man. (And instead of defending against the possibility that someone is doing some weird string pooling, how about just disallowing that - malloc and free are your friends.)

- Avoiding void*. As mentioned in another comment, it's useful for polymorphism. You can write quite nice polymorphic code in C, and then you end up using void* a lot.


Yes that section raised my hackles too, to the point where I'm suspicious of the whole article.

The solution, in my opinion, is to either document that strclone()'s return should be free()'d, or alternatively add a strfree() declaration to the header (which might just be `#define strfree(x) free(x)`).

Adding a `char **out` arg does not, in my opinion, document that the pointer should be free()'d.
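
That suggestion might look like this (keeping the article's strclone name, reserved-identifier caveat from a sibling comment notwithstanding):

    /* mystring.h */
    #include <stdlib.h>

    /* Returns a heap copy of s, or NULL; release it with strfree(). */
    char *strclone(const char *s);
    #define strfree(x) free(x)

    /* mystring.c */
    #include <string.h>

    char *strclone(const char *s) {
        size_t n = strlen(s) + 1;
        char *copy = malloc(n);
        if (copy) memcpy(copy, s, n);
        return copy;
    }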


> Make sure that you turn on warnings as errors

I'm seeing this way too often. It is a good idea to never ignore a warning, and developers without discipline may need it. But for god's sake, there is a reason why there are warnings and errors, and they are treated differently. I don't think compiler writers and/or C standards will deprecate warnings and make them errors anytime soon, and for good reason. So IMHO it is better to treat errors as errors and warnings as warnings. I have seen plenty of times that this flag is mandatory, and to avoid the warning (error) the code is decorated with compiler pacifiers, which makes no sense!

So for some setups I understand the value, but doing it all the time shows some kind of laziness.


> and to avoid the warning (error) the code is decorated with compiler pacifiers, which makes no sense!

How is that a bad thing, exactly?

Think of it this way: The pacifiers don't just prevent the warnings. They embed the warnings within the code itself in a way where they are acknowledged by the developer.

Sure, just throwing in compiler pacifiers willy-nilly to squelch the warnings is terrible.

However, making developers explicitly write in the code "Yes, this block of code triggers a warning, and yes it's what I want to do because xyz" seems not only perfectly fine, but straight up desirable. Preventing them from pushing the code to the repo before doing so by enabling warnings-as-errors is a great way to get that done.
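
For instance, with GCC and Clang the acknowledgment can be scoped to exactly the offending expression (MSVC has #pragma warning(push/disable/pop) for the same purpose):

    /* The pacifier doubles as documentation: the warning is acknowledged
       and the justification sits next to the code that triggers it. */
    static int is_sentinel(double value, double sentinel) {
        /* Intentional exact comparison: the sentinel is stored verbatim,
           never computed, so bitwise equality is the right test. */
    #pragma GCC diagnostic push
    #pragma GCC diagnostic ignored "-Wfloat-equal"
        return value == sentinel;
    #pragma GCC diagnostic pop
    }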

The only place where I've seen warnings-as-errors become a huge pain is when dealing with multiple platforms and multiple compilers that have different settings. This was a big issue in Gen7 game dev, because getting the PS3's gcc, the Wii's CodeWarrior and the Xbox 360's MSVC to align on warnings was like herding cats, and not every dev had every devkit, for obvious reasons. And even then, warnings-as-errors was still very much worth it in the long run.


    void employee_set_age(struct Employee* employee, int newAge) {
        // Cast away the const and set its value, the compiler should optimize this for you
        *(int*)&employee->age = newAge;
    }

I believe that "Casting away the const" is UB [1]

[1]: https://en.cppreference.com/w/c/language/const.html


It's only UB if the pointed-to object is actually const (in which case it might live in read-only memory).
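
A minimal illustration of that distinction:

    static void set_via_cast(const int *p, int v) {
        *(int *)p = v;  /* OK only if the referenced object isn't const */
    }

    void demo_const(void) {
        int obj = 1;
        const int *view = &obj;      /* const view of a mutable object */
        set_via_cast(view, 2);       /* fine: obj itself is not const */

        static const int fixed = 1;  /* may be placed in read-only memory */
        /* set_via_cast(&fixed, 2);     undefined behavior: the object
                                        itself really is const */
        (void)fixed;
    }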

Outstanding, why hadn't I come across this before?

Quite interesting, and felt fairly "modern" (which for C programming advice sometimes only means it's post-2000 or so). A few comments:

----

This:

    struct Vec3* v = malloc(sizeof(struct Vec3));
is better written as:

    struct Vec3 * const v = malloc(sizeof *v);
The `const` is perhaps over-doing it, but it makes it clear that "for the rest of this scope, the value of this pointer won't change" which I think is good for readability. The main point is "locking" the size to the size of the type being pointed at, rather than "freely" using `sizeof` the type name. If the type name later changes, or `Vec4` is added and code is copy-pasted, this lessens the risk of allocating the wrong amount and is less complicated.

----

This is maybe language-lawyering, but you can't write a function named `strclone()` unless you are a C standard library implementor. All functions whose names begin with "str" followed by a lower-case letter are reserved [1].

----

This `for` loop header (from the "Use utf8 strings" section):

    for (size_t i = 0; *str != 0; ++len)
is just atrocious. If you're not going to use `i`, you don't need a `for` loop to introduce it. Either delete it (`for (; ...` is valid) or use a `while` instead.
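
That is, something like this (guessing at the body, since the article's loop is counting UTF-8 code points):

    size_t len = 0;
    while (*str != 0) {
        /* count only lead bytes; continuation bytes look like 10xxxxxx */
        if (((unsigned char)*str & 0xC0) != 0x80)
            ++len;
        ++str;
    }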

----

In the "Zero Your Structs" section, it sounds as if the author recommends setting the bits of structures to all zero in order to make sure any pointer members are `NULL`. This is dangerous, since C does not guarantee that `NULL` is equivalent to all-bits-zero. I'm sure it's moot on modern platforms where implementations have chosen to represent `NULL` as all-bits-zero, but that should at least be made clear.
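
The distinction in code: an initializer guarantees genuine null pointers, while memset only guarantees all-bits-zero:

    #include <string.h>

    struct Node { struct Node *next; int value; };

    void zeroing_demo(void) {
        struct Node a = {0};      /* a.next is NULL, guaranteed by the
                                     standard's initialization rules */
        struct Node b;
        memset(&b, 0, sizeof b);  /* b.next is all-bits-zero, which is NULL
                                     only where the platform defines it so */
        (void)a; (void)b;
    }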

[1]: https://www.gnu.org/software/libc/manual/html_node/Reserved-...


    This:

    struct Vec3* v = malloc(sizeof(struct Vec3));

    is better written as:

    struct Vec3 * const v = malloc(sizeof *v);
I don't love this. Other people are going to think you're only allocating a pointer. It's potentially confusing.

I also personally find it totally confusing to leave the * in the middle of nowhere, flapping in the breeze.

Where would you put it? The const on the pointer is not the main point; it's just extra clarity that the allocated pointer is not as easily overwritten, which would leak the memory.

Uh, okay, but if you need to constantly write code as if people reading it don't understand the language, then ... I don't know how to do that. :)

It's not possible to know C code and think that

    sizeof *v
and

    sizeof v
somehow mean the same thing, at least not to me.

No, but you can misread the two interchangeably no matter how familiar you are with the language.

>What this means is that you can explain all the intent of your code through the header file and the developer who uses your lib/code never has to look at the actual implementations of the code.

I hate this. If my intellisense isn't providing sufficient info (generated from doc comments), then I need to go look at the implementation. This just adds burden.

Headers are unequivocally a bad design choice, and this is why most languages designed after the nineties got rid of them.


Separating interface from implementation is one of the core practices for making large code bases tractable.

Of course, but that's doable without making programmers maintain headers, and some modern languages do that.

I've found it's usually to poor effect. Both Rust and Haskell did away with .mli files and ended up worse for it. Haskell simplified the boundary between modules and offloaded the abstractions they were used for into its more robust type system, but it ended up lobotomizing the modularity paradigm that ML did so well.

Rust did the exact opposite and spread the interface language across keywords in source files, making it simultaneously more complicated and ultimately less powerful, since it ends up lacking a cognate to higher-order modules. Now the interfaces are machine-generated, but the modularity paradigm ends up lobotomized all the same, and my code is littered with pub impls (pimples, as I like to call them) as though it's Java or C# or some other ugly mudball.

For Haskell, the type system at least compensates for the bad module system, compile times aside. For Rust, it's hard for me to say anything positive about its approach to interfaces and modules. I'd rather just have .mlis instead of either.


Look to Ada for “headers” (i.e. specs) done right.

I recently became a big Ada fanboy, which is ironic because I'm far more a fan of minimal, succinct syntax like Lisp, Forth, etc., and I actually successfully lobbied a professor in 1993 to _not_ use it in an undergrad software engineering class lol.

Still in the honeymoon phase, granted, but I'm actually terrified that these new defense tech startups collectively have no clue about Ada.

Your startup wants to ship a SaaS MVP ASAP and iterate? Sure, grab Python or JS and whatever shitstorm of libraries you want to wrestle with.

Want to play God and write code that kills?

Total category error.

The fact that I'm sure there are at least a few of these defense tech startups yolo'ing our future away with vibe-coded commits when writing code that... let's not mince words... takes human life... probably says a lot about how far we've fallen from "engineering".


C's text preprocessor headers were a pragmatic design choice in the 1970s. It's just that the language stuck around longer than it deserved to.

So what language is ready to take its place in the thousands of new chips that emerge every year, the new operating systems, and the millions of programs written in C every year?


