Smart Pointers in (GNU) C (snai.pe)
147 points by indigoabstract 30 days ago | 35 comments



GLib also provides macros that use autocleanup.[0]

Using a bunch of macro magic, they allow you to write `g_autoptr(GPtrArray) arr = ...` to automatically unref arr when the scope exits. One footgun that autocleanup has and C++ smart pointers don't is that the cleanup function isn't called on reassignment, so reassigning an autoptr causes the previous value to leak.

[0]: https://docs.gtk.org/glib/auto-cleanup.html
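
To make the footgun concrete, here's a minimal sketch (g_autoptr and g_ptr_array_new are standard GLib; the snippet itself is illustrative, not from any real codebase):

  #include <glib.h>

  void example(void) {
      /* unref runs automatically when arr goes out of scope */
      g_autoptr(GPtrArray) arr = g_ptr_array_new();

      /* footgun: no cleanup is emitted for the old value on reassignment,
         so the first array leaks */
      arr = g_ptr_array_new();
  }   /* only the second array is unreferenced here */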


I think the desire for smart pointers comes from having many allocations with varying lifetimes. In my opinion, smart pointers are a band-aid at too low a level to solve this problem. In C++ this seems particularly attractive because it is very common to treat freeing memory and deinitializing an object as the same thing and handle both with RAII.

C becomes much more bearable and even ergonomic when you introduce multiple allocators, especially an arena. This reduces the number of separate allocations and lifetimes dramatically. A big failure of the C standard library, in my opinion, is that it does not offer such allocators and its APIs don't make use of them (which is why strings are terrible in C).
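
For readers who haven't used one, a minimal bump-allocator arena looks roughly like this (names and the fixed-capacity policy are illustrative):

  #include <stdlib.h>

  typedef struct {
      char  *base;   /* backing buffer               */
      size_t used;   /* bytes handed out so far      */
      size_t cap;    /* total capacity of the buffer */
  } Arena;

  void *arena_alloc(Arena *a, size_t size) {
      size = (size + 15) & ~(size_t)15;            /* keep results 16-byte aligned   */
      if (a->used + size > a->cap) return NULL;    /* real code might grow or chain  */
      void *p = a->base + a->used;
      a->used += size;
      return p;
  }

  /* One call releases every allocation made from the arena. */
  void arena_reset(Arena *a) { a->used = 0; }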


The thing I like most about C is its simplicity, its minimalism, and its being the interface of choice for other languages to talk to each other.

You can completely understand the language since it's so small and doesn't change much, and (if you avoid complicated pointer expressions) that also helps with reading the code.

That said, I don't enjoy manual memory management, but I'd like a GC even less.

Arena allocators are useful, but since reference counting is the preferred general solution in C++, could something similar also work in C without taking away that minimalism? I think it might, but that is just my opinion.


> You can completely understand the language since it's so small

Not really. There are many dark corners to the language, and many opportunities for subtle misunderstandings.

Plenty of C programmers don't even understand the basics of how types work in C, e.g. why you shouldn't use printf's %d format specifier with an argument of type size_t.
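
For example, on a typical 64-bit target size_t is 64 bits while int is 32, so the mismatch below is undefined behaviour; the correct conversion specifier is %zu (a small illustrative snippet):

  #include <stdio.h>

  int main(void) {
      size_t n = sizeof(long double);
      printf("%d\n", n);    /* wrong: %d expects int, but n is size_t   */
      printf("%zu\n", n);   /* correct: %zu is the size_t specifier (C99) */
      return 0;
  }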


  > could something similar also work in C without taking away that minimalism?

Maybe automatic reference counting? [1]

[1] https://en.wikipedia.org/wiki/Automatic_Reference_Counting


Yeah, I remember that. It works well for Objective-C.

But since C doesn't have objects or messages, the closest equivalent I can think of is for the C compiler to emit some kind of hook on assignments and on variables going out of scope.

I don't claim to know what the best solution is for adding automatic memory management to C, but I'm pretty sure a good one exists. And it would need compiler support to work.
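
As a rough illustration of what such compiler-emitted hooks could boil down to, here is a hand-written sketch; rc_alloc/rc_retain/rc_release are made-up names, and an ARC-style compiler would insert the last two calls automatically:

  #include <stdlib.h>

  typedef struct { size_t count; } RcHeader;   /* header stored before the payload;
                                                   real code would pad it for alignment */

  void *rc_alloc(size_t size) {
      RcHeader *h = malloc(sizeof *h + size);
      if (!h) return NULL;
      h->count = 1;
      return h + 1;                             /* caller sees only the payload */
  }

  void *rc_retain(void *p) {
      if (p) ((RcHeader *)p - 1)->count++;
      return p;
  }

  void rc_release(void *p) {
      if (p && --((RcHeader *)p - 1)->count == 0)
          free((RcHeader *)p - 1);
  }

  void user_code(void *obj) {
      void *local = rc_retain(obj);   /* would be inserted at assignment            */
      /* ... use local ... */
      rc_release(local);              /* would be inserted when local leaves scope  */
  }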


What I wrote about is not GC. The other comments about reference counting also unfortunately completely miss the point I made.

The code stays simple because nothing is happening in the background, no GC, no macro abuse. For example, you can clearly see where allocations can happen in join_path(temp_arena, basedir, filepath). At the same time I'm not calling free everywhere, and I can avoid a lot of the "goto cleanup" dance or the gcc-specific extension.
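
A sketch of what that call site implies; join_path here is illustrative (not the parent's actual code), and Arena/arena_alloc are assumed to be a bump allocator like the one sketched upthread:

  #include <string.h>

  typedef struct Arena Arena;
  void *arena_alloc(Arena *a, size_t size);    /* assumed bump allocator */

  char *join_path(Arena *a, const char *dir, const char *name) {
      size_t ld = strlen(dir), ln = strlen(name);
      char *out = arena_alloc(a, ld + ln + 2);  /* the only place memory is obtained */
      if (!out) return NULL;
      memcpy(out, dir, ld);
      out[ld] = '/';
      memcpy(out + ld + 1, name, ln + 1);       /* copies the terminating '\0' */
      return out;   /* no free() here: the whole temp_arena is reset in one go */
  }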

Frankly, I feel like understanding this approach to memory management is the best way to tell C experts apart from juniors/wannabes.


> What I wrote about is not GC. The other comments about reference counting also unfortunately completely miss the point I made.

I know. I mentioned the GC because of another comment and because reference counting and GCs are the two usual general ways (that I know of) for automating memory management. That's what I had in mind.

Unless I'm misunderstanding, arena allocators are most useful for categories of objects with similar lifetimes, and they are not in the fully automated category: if I have multiple arena allocators, I still have to manually assign an allocator when creating an object and also deal with the lifetimes of the allocators themselves.

Easier than allocating/deallocating individual objects in main memory with malloc, true, and probably more efficient, but still not fully automated, so in my mind they have different use cases.

But it's possible that I haven't really understood them, so if I'm mistaken about their use, please don't leave me in the dark. If that makes me look like a junior wannabe, I think I can take it. Even the people who know their stuff were, at some point, people who didn't.


This is something Zig does, and it's a great improvement indeed!


> This sucks, but if we make the assumption that no destruction shall happen during an async context, it’s kind of fine (Oh boy, this will definitely make people jump out of their seat). On a side note, I have yet to see a proper solution to that, and I would like to avoid having double pointed types to solve the issue – if anyone has an idea, feel free to send me a message or send a pull request on the github repository.

An async context could mark pointers for destruction rather than destroying them immediately (deferring destruction by enqueueing them), and a dedicated thread would periodically wake and process the queue of marked objects. It should be processed on another thread because the object is, by definition, no longer reachable from the original thread once it's marked for destruction.
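
A very small sketch of that idea; all names are made up, and a real implementation would have to decide who owns the reclamation thread and how objects record their destructors:

  #include <pthread.h>
  #include <stdlib.h>

  typedef struct Deferred {
      void *ptr;
      struct Deferred *next;
  } Deferred;

  static Deferred *queue;
  static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;

  /* Called instead of destroying immediately inside an async context. */
  void defer_destroy(void *p) {
      Deferred *node = malloc(sizeof *node);
      if (!node) return;                      /* sketch only: real code must not drop p */
      node->ptr = p;
      pthread_mutex_lock(&queue_lock);
      node->next = queue;
      queue = node;
      pthread_mutex_unlock(&queue_lock);
  }

  /* Run periodically by a dedicated thread (or called manually). */
  void drain_deferred(void) {
      pthread_mutex_lock(&queue_lock);
      Deferred *n = queue;
      queue = NULL;
      pthread_mutex_unlock(&queue_lock);
      while (n) {
          Deferred *next = n->next;
          free(n->ptr);                       /* or invoke the object's destructor */
          free(n);
          n = next;
      }
  }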


I'm assuming the OP is doing this specifically to avoid implementing a GC so I'm not sure this approach would be satisfactory, but it would work.


You wouldn't need to spin it up if you don't use "async contexts", and the GC only collects objects freed during those contexts, so it can be very limited. You could eliminate it entirely if your program threads periodically call into a cleanup function manually to process this queue. The point of the thread is to make this automatic, since that's ostensibly the whole point of smart pointers.


Basically a wrapper for "__attribute__((cleanup))".

GCC took the best C++ feature (destructors) and added it into C. Not sure about compatibility with other C compilers.
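
For anyone who hasn't seen the attribute in action, a minimal sketch (names are illustrative; the cleanup function receives a pointer to the variable that is going out of scope):

  #include <stdio.h>
  #include <stdlib.h>

  static void free_charp(char **p) { free(*p); }

  void example(void) {
      /* GCC/Clang extension: free_charp(&buf) runs when buf leaves scope,
         on every exit path (return, goto, falling off the end) */
      __attribute__((cleanup(free_charp))) char *buf = malloc(64);
      if (!buf) return;
      snprintf(buf, 64, "no explicit free needed");
      puts(buf);
  }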


There is a proposal to get this, via a "defer" keyword, into either the next C revision or the one after that[1].

[1]: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3199.htm


It seems that whenever ISO C invents its own imitation of some GNU feature, in order to foist it onto other compilers, they fuck it up first.

VLAs, inline, variadic macros, ...

More recently, they gaffed by making alignment specification not a type attribute but, get this, a storage class specifier. Ouch!


That looks a lot like what Zig does.

One problem with the proposed defer there is that it has no way to interact with stack unwinding, which is part of the platform ABIs for handling exceptions. Maybe some alternative syntax, such as "defer_finally", could make the defer block act as a regular defer block and also explicitly add that block to the stack-unwinding chain.


That would be sublime! Defer is something I've been wanting in C for 20 years. It makes resource management clear, easy to reason about, and concise.


You can implement a crappy version of defer with GNU extensions, I think? https://gist.github.com/cozzyd/a9eb2ddb9c8785ad5c60e3280b0ba...
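
One common way to do it (not necessarily what the linked gist does) combines __attribute__((cleanup)) with GCC's nested functions, so it's GCC-only; the macro names below are illustrative:

  /* Each defer() declares a dummy variable whose cleanup handler is a
     nested function wrapping the deferred statement. */
  #define DEFER_CAT_(a, b) a##b
  #define DEFER_CAT(a, b)  DEFER_CAT_(a, b)
  #define defer(stmt)                                                              \
      void DEFER_CAT(defer_fn_, __LINE__)(int *unused_) { (void)unused_; stmt; }   \
      __attribute__((cleanup(DEFER_CAT(defer_fn_, __LINE__))))                     \
      int DEFER_CAT(defer_var_, __LINE__) = 0

  #include <stdio.h>

  void example(void) {
      FILE *f = fopen("log.txt", "r");
      if (!f) return;
      defer(fclose(f));        /* runs when the enclosing scope is left */
      /* ... use f ... */
  }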



I'm fairly certain Clang also implements it.


Strong vibes of Greenspun's tenth rule:

Any sufficiently complicated C or Fortran program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp.

If you want garbage collection, use a language that has it.


Smart pointers aren’t exactly the same thing as garbage collection…


Worth noting that the metadata structure can be done more efficiently with a flexible array member, but it requires alignment to alignof(max_align_t) to obey the ABI. The structure will consume the same amount of memory (16 bytes on x86-64), but you avoid a dereference and you get an extra pointer's worth of space to use for whatever.
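
A sketch of that layout; the field names are illustrative, and using max_align_t for the flexible member keeps the user data that follows the header suitably aligned:

  #include <stddef.h>
  #include <stdlib.h>

  typedef struct {
      void (*dtor)(void *);   /* cleanup to run when the object is released  */
      void  *extra;           /* the spare pointer-sized slot mentioned above */
      max_align_t data[];     /* user allocation starts here; no extra pointer
                                 to chase to reach the metadata               */
  } Meta;                     /* header occupies 16 bytes on x86-64           */

  void *smart_alloc(size_t size, void (*dtor)(void *)) {
      Meta *m = malloc(offsetof(Meta, data) + size);
      if (!m) return NULL;
      m->dtor  = dtor;
      m->extra = NULL;
      return m->data;
  }

  /* Recover the header from the pointer handed out to the caller. */
  Meta *meta_of(void *user) {
      return (Meta *)((char *)user - offsetof(Meta, data));
  }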


(2015) or earlier


It's great when the ratio of CMake to C is close to 3...


Why do some people hate Make so much that they use an abomination like CMake? It has terrible syntax, terrible semantics, and you need to run Make anyway. I do not get it.

If it were noticeably better, I'd understand. It's probably because it looks nice in the tutorial with apparently simple syntax, but then on real projects it devolves into a hundred lines of abstraction nonsense over environment variables and spawning programs.


> Why do some people hate Make so much to use an abomination like CMake?

Because Make doesn't work either. This holds for Make too:

> Probably because it looks nice in the tutorial with apparently simple syntax, but then on real projects it devolves into a hundred lines of abstraction nonsense over environment variables and spawning programs.

The real solution is of course using neither (nor Automake). Pick one build system that works for more than just C, C++ and Fortran, and use that. I've chosen Buck 2 https://buck2.build/

https://github.com/Release-Candidate/Cxx-Buck2-vcpkg-Example... https://github.com/Release-Candidate/Cxx-Buck2-Conan-Example...


How about the importance of your build system lasting for a decade or more?

For all the hate about Make, it works just like it used to a decade ago.


> Make, it works just like it used to a decade ago.

So, not at all for anything non-trivial. It's not that we all got together one day and said "you know what, (GNU) Make does everything we need, let's build some new, more complex alternatives". Make is fine until it isn't, which is a point that's easy to reach with C, C++ and Fortran with modules (so, everything after Fortran 77).

And nowadays it's _way_ more homogeneous than back in the day, when you had all the real Unices with their own compiler and own linker and assembler and their own way to build and use libraries, in 32 and 64 bit on the same system. Not to forget the oddball using VMS.


Is your deliverable source code or is it a binary?

In my experience, Make (and build scripts) work just fine when your deliverable is a binary. You have complete control of the build environment, so hardcoding e.g., /opt/my/cross/compiler or relying on $PATH is a non-issue.

However, if your deliverable is source code, then you can only make so many assumptions about the build environment. This is where GNU Autotools and CMake shine.


Make isn't cross-platform, and depends on a ton of external executables.

By all means stay with Make if UNIX is the only target, but then also take care to only use the POSIX subset.


"because it looks nice in the tutorial with apparently simple syntax, but then on real projects it devolves into a hundred lines of abstraction nonsense over environment variables and spawning programs"

you just described "make" as well as cmake. let alone autotools (./configure, Makefile.am, etc.)

"It has terrible syntax, terrible semantics, and you need to run Make anyway"

agree. the core reason, overriding all of that, is that they fundamentally do different jobs. Make is an engine that takes a high-level description of rules and dependencies, builds a graph of them, and then executes on it. but in order to work correctly, you need a full dependency graph.

the problem is that when you have a large project with lots and lots of directories that build a bunch of inter-dependent artifacts, you don't necessarily like having one single big monolithic "Makefile". so of course you break it up. Every directory has a fragment that describes just what it builds, and so on.

the old way of doing that was hierarchical makefiles and "recursive make" [1] which kinda works OK, but treats each leaf makefile and directory as an independent product. the core makefile only knows coarse dependency information about those directories as a single unit. this also has pretty severe performance and parallel build issues.

the right way of doing it is some kind of multi-stage makefile system. the first stage doesn't build the product, it builds a concrete monolithic makefile containing the whole dependency graph from distributed information in each part of the project. the second stage builds the product.

you can write the first stage in python, bash, gnu-make, etc. but make is really a difficult tool to write procedural-type code in. the linux build system does this, but it's certainly not simple.

so that's all CMake (wonky syntax and all) is doing. it's just a bespoke system tuned specifically for C and C-like compilation that lets you generate a makefile (or Ninja, etc.) backend. of course it has all kinds of other features as well, but to me the core difference is that make is low-level and cmake fills that role for a high-level system.

[1] https://accu.org/journals/overload/14/71/miller_2004/


My deliverable is source code; my code targets several different platforms; and my clients expect to produce builds for all supported platforms from one build environment.

Delivering with GNU Make alone would require me to either write tons of documentation for my clients to read in order to set up their build environment just right, or write tons of shell script to accomplish a fraction of what GNU Autotools already does.

I'll pre-emptively state that delivering a "use this to build" Docker image is not an option, unfortunately.


CMake does make configuration easier (via its TUI), and for trivial shared libraries / simple programs it can be convenient, but once you're outside its happy zone and have to run code generators and do text processing, it quickly becomes much more unwieldy than a makefile + sed.


It's time to reply to the issues that have been posted over the years...



