
Cello High Level C: A Fat Pointer Library - nkurz
http://libcello.org/learn/a-fat-pointer-library
======
kabdib
It's all fun and games until you say

    
    
        ++p
    

and pass that to a function that is silently expecting a length and other
stuff in front of it. Calling these things "pointers" is a bit of a promotion
from what they really are. Naming them "char*" or whatever is pretty dangerous
if you ever do pointer movement.

Seriously, if you want descriptors and typed blobs, just make up a struct for
what you want and pass it around. Say what you mean. Lying like this is going
to hurt you.
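
A minimal sketch of the hazard described above, set next to the explicit-struct alternative. The names here (`struct header`, `blob_new`, `blob_header_addr`, `blob_free`, `struct blob`) are hypothetical and are not Cello's API. The point is only that the header lookup is pure address arithmetic, so a moved payload pointer silently yields the wrong header address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical header stored immediately before the payload. */
struct header {
    size_t len;
};

/* Allocate header + payload; return a pointer to the payload, or NULL. */
char *blob_new(size_t len)
{
    struct header *h = calloc(1, sizeof *h + len);
    if (h == NULL)
        return NULL;
    h->len = len;
    return (char *)h + sizeof *h;
}

/* Address where the code ASSUMES the header lives for a payload pointer.
   Nothing here can tell whether the caller has moved the pointer. */
const char *blob_header_addr(const char *payload)
{
    assert(payload != NULL);
    return payload - sizeof(struct header);
}

/* Release a blob; only correct for the ORIGINAL payload pointer. */
void blob_free(char *payload)
{
    if (payload != NULL)
        free(payload - sizeof(struct header));
}

/* The explicit alternative: the length travels with the pointer, and the
   type says so. Advancing `data` does not lose `len`. */
struct blob {
    char  *data;
    size_t len;
};
```

With `p = blob_new(8)`, `blob_header_addr(p + 1)` points one byte into the real header, which is the "silently expecting a length in front of it" failure from the comment.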

~~~
tptacek
This is also a common, and sometimes exploitable, coding error with C++ smart
pointers.

I agree with you: I'm not sure how this is any more powerful than any
encapsulated ADT-style foo_t-wraps-a-void-star C library.

~~~
Buge
Well, with C++ unique_ptr and shared_ptr you can't ++ or += or even get a raw
pointer without explicitly calling .get().

------
asveikau
There is all kinds of standards-abhorrent stuff and bad practices in here.

Examples from the link:

    
    
        struct Header* head = calloc(1, sizeof(struct Header) + data_size);
        head->type = type;
    

`head` is not checked for NULL. This will segfault when heap allocations fail.
[Preemptive reply to folks used to high-level languages and the Linux
overcommit behavior: yes, heap allocations do fail, and yes, it is possible to
handle such a failure well.]

    
    
        typedef void* var;
    
        // ...
        return ((var)head) + sizeof(struct Header);
    

It's illegal to do pointer arithmetic on void pointers. You need to cast to
another type first.
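
A sketch of how the two snippets above could be written to address both points, assuming a `struct Header` with only the `type` field shown in the quoted code. The function names `alloc_checked` and `header_of` are made up for illustration. Arithmetic on `void *` is accepted by GCC as an extension, but standard C requires a cast to a character type first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef void *var;

/* Stand-in for the article's header; only `type` appears in the quote. */
struct Header {
    var type;
};

/* Returns data_size zeroed bytes preceded by a Header, or NULL on failure. */
var alloc_checked(var type, size_t data_size)
{
    /* Reject sizes whose sum with the header would overflow size_t. */
    if (data_size > SIZE_MAX - sizeof(struct Header))
        return NULL;

    struct Header *head = calloc(1, sizeof(struct Header) + data_size);
    if (head == NULL)
        return NULL;               /* let the caller decide how to react */
    head->type = type;

    /* char * arithmetic is well defined; void * arithmetic is not. */
    return (char *)head + sizeof(struct Header);
}

/* Recover the header from a pointer returned by alloc_checked. */
struct Header *header_of(var self)
{
    assert(self != NULL);
    return (struct Header *)((char *)self - sizeof(struct Header));
}
```

This does not yet deal with the alignment of the payload, which is a separate concern.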

    
    
        #define alloc_stack(T) header_init( \
          (char[sizeof(struct Header) + sizeof(struct T)]){0}, T)
    

That's going to crash when you try to access the T on many non-x86
architectures. On x86 you will have subtle problems like atomic ops failing to
work.
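
One way to avoid the alignment problem, sketched under the assumption of a C11 compiler: let the compiler lay out header and payload as a struct rather than carving both out of a `char` array. `STACK_BOX`, `PAYLOAD_OFFSET`, `box_header`, `struct Point`, and `stack_box_demo` are hypothetical names, not part of Cello. The sketch also assumes the compiler inserts only the minimum padding before the aligned member, which the demo function checks at run time.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

struct Header { void *type; };
struct Point  { double x, y; };    /* hypothetical payload type */

/* Header followed by a payload aligned for any standard type. The storage
   has the right declared type, so alignment and aliasing are both sound. */
#define STACK_BOX(T) struct { struct Header head; alignas(max_align_t) T data; }

/* Distance from header to payload: sizeof(struct Header) rounded up to the
   strictest standard alignment, hence the same for every T. */
#define PAYLOAD_OFFSET \
    ((sizeof(struct Header) + alignof(max_align_t) - 1) \
     / alignof(max_align_t) * alignof(max_align_t))

/* Recover the header from a payload pointer taken from a STACK_BOX. */
struct Header *box_header(void *payload)
{
    assert(payload != NULL);
    return (struct Header *)((char *)payload - PAYLOAD_OFFSET);
}

/* Returns 1 if the payload is max-aligned and the header round-trips. */
int stack_box_demo(void)
{
    static int point_type;         /* stand-in type tag */
    STACK_BOX(struct Point) box = { .head = { .type = &point_type } };
    box.data.x = 1.5;

    void *p = &box.data;
    return (uintptr_t)p % alignof(max_align_t) == 0
        && box_header(p) == &box.head
        && box_header(p)->type == &point_type;
}
```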

I have seen Cello come up on HN before. Seems like a cute hobby project based
on some flawed ideas of what "C" is. If people interested in learning C are
reading, I suggest learning C "for realsies" and avoiding this thing.

~~~
kabdib
Fully agreed about "what C is". Many, many sins have been committed in the
name of saving keystrokes or making things temporarily easier. That little
voice in the back of your head telling you that something might be a bad idea?
Listen to it, it's probably right.

On handling out-of-memory conditions:

- Yes, you can do it

- It's also hard to get right, in the general case, for large systems. It's
often a Pyrrhic victory

Most systems I've worked with have simply restarted, rather than risk getting
complicated recovery logic wrong (and winding up in a worse situation --
corrupting persistent data, or giving wrong answers -- than if they simply
crashed). A few systems have a 'reserve tank' that they can use to do a
controlled crash, saving important state and whatnot before quitting. OOM can
get pretty wicked.

~~~
asveikau
There are large projects that handle OOM and it isn't that hard to do.

Imagine every function in your call stack handles errors consistently. In most
cases these functions will bubble up all errors to the caller. In many cases
they will perform allocations themselves and free those allocations when they
fall out of scope due to either success or error.

The function at the top of the stack hits a malloc error. It will bubble up
its status to its caller, who will do the same for his caller, etc. The chain
of functions will free intermediate allocations they made along the way. By
the time you get to some top-level or near-top-level function, you can react
to the error, and you likely even have quite a bit of heap space when the rest
of the stack frees its work. But if you don't happen to have the heap space,
then you can structure that top-level error handler so that it performs very
few allocations, does its allocations upfront at initialization time, etc.
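
The pattern described above can be sketched roughly as follows. Everything here is illustrative: the status codes, the function names, and the `fail_countdown` test hook are assumptions of this sketch, not taken from any project mentioned in the thread.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum status { OK = 0, ERR_NOMEM = 1 };   /* hypothetical status codes */

/* Test hook (an assumption of this sketch): when >= 0, that many
   allocations succeed and every later one fails. -1 disables it. */
int fail_countdown = -1;

static void *xmalloc(size_t n)
{
    if (fail_countdown == 0)
        return NULL;
    if (fail_countdown > 0)
        fail_countdown--;
    return malloc(n);
}

/* Leaf: duplicates a string, reporting failure instead of crashing. */
enum status dup_string(const char *src, char **out)
{
    assert(src != NULL && out != NULL);
    size_t n = strlen(src) + 1;
    char *copy = xmalloc(n);
    if (copy == NULL)
        return ERR_NOMEM;
    memcpy(copy, src, n);
    *out = copy;
    return OK;
}

/* Middle layer: makes two allocations, frees them on every exit path,
   and passes any error up unchanged. */
enum status joined_length(const char *a, const char *b, size_t *out_len)
{
    char *ca = NULL, *cb = NULL;
    enum status st;

    st = dup_string(a, &ca);
    if (st != OK)
        goto cleanup;
    st = dup_string(b, &cb);
    if (st != OK)
        goto cleanup;

    *out_len = strlen(ca) + strlen(cb);

cleanup:
    free(cb);                      /* free(NULL) is a no-op */
    free(ca);
    return st;
}

/* Top level: reacts using storage reserved up front, so the handler
   itself needs no heap. Returns NULL on success, a message on failure. */
static char error_buf[64];

const char *run(const char *a, const char *b, size_t *out_len)
{
    if (joined_length(a, b, out_len) != OK) {
        strcpy(error_buf, "out of memory");
        return error_buf;
    }
    return NULL;
}
```

Setting `fail_countdown` to 0, 1, 2, and so on in turn exercises each failure path, which is a small version of the "test almost every malloc failing" harness discussed further down the thread.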

I don't think any of this is hard and I've seen it work well in practice. It's
sad to me when I see the opposite, some unreasonable allocation quite
reasonably fails, and it takes down the entire process because whoever wrote
that code thought it was too hard to do otherwise.

~~~
hp
The reason it's hard is that everything you do has to become a "transaction"
with the ability to roll back. Say you send five messages in a function, now
you have to pre allocate all of them in order to cancel them all if any
allocation fails, before sending any. But this pre allocation tends to break
abstraction barriers (every API might need separate "prepare" and "fire"
calls). It doesn't sound that complicated at first but it gets that way in a
hurry. Almost every function can fail, every operation needs the ability to
rollback midstream... it makes a mess in a hurry.
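
The five-message case might look something like this when split into "prepare" and "fire" phases. This is only a sketch: `struct message`, `prepare`, `fire`, `send_batch`, and the `alloc_budget` test hook are hypothetical, and a real transport would have failure modes of its own in the second phase.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BATCH 5

struct message { char text[32]; };

/* Stand-in for a real transport: just counts what has been sent. */
int sent_count = 0;

static void fire(struct message *m)
{
    assert(m != NULL);
    sent_count++;
}

/* Test hook (assumption): allocations allowed to succeed; -1 = unlimited. */
int alloc_budget = -1;

/* Phase 1 for one message: may fail, has no visible side effects. */
static struct message *prepare(const char *text)
{
    if (alloc_budget == 0)
        return NULL;
    if (alloc_budget > 0)
        alloc_budget--;
    struct message *m = calloc(1, sizeof *m);
    if (m != NULL)
        strncpy(m->text, text, sizeof m->text - 1);   /* stays terminated */
    return m;
}

/* Allocates everything first; sends only if every allocation succeeded.
   Returns 0 on success, -1 if nothing was sent. */
int send_batch(const char *texts[BATCH])
{
    struct message *msgs[BATCH] = {0};
    int rc = 0;

    for (int i = 0; i < BATCH; i++) {
        msgs[i] = prepare(texts[i]);
        if (msgs[i] == NULL) {
            rc = -1;               /* roll back: nothing has been sent yet */
            break;
        }
    }

    if (rc == 0)
        for (int i = 0; i < BATCH; i++)
            fire(msgs[i]);

    for (int i = 0; i < BATCH; i++)
        free(msgs[i]);
    return rc;
}
```

Even in this toy version the caller-facing API has grown a second phase, which is the abstraction cost the comment is pointing at.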

It's also a LOT of extra code, really material bloat.

One experience I had doing this:
http://blog.ometer.com/2008/02/04/out-of-memory-handling-d-bus-experience/

If you haven't written the test harness to test almost every malloc failing,
you might think this is easier than it really is.

Adding 30-40% more code to your code base that will almost never get tested
except maybe by your unit tests ... no thanks, not if it's possibly avoidable
for a given application.

~~~
asveikau
I find it odd to see a reply written as if I haven't written in a
transactional style and seen it working well. But ignoring that for a second.
Your blog post says the error handlers did the wrong thing 5% of the time. Can
I read from that they did the right thing 95% of the time? And you will
dismiss the technique for that?

Not to mention that there are coding styles that make the transactional
approach less difficult. (OK, so reverting your work gets hairy in the
presence of certain side effects. In many cases I would rather choose some
behavior and stick with it than take down an entire process dereferencing a
null pointer.)

~~~
hp
All I'm saying is that "not hard", as you put it, and "30% more code,
transactions, and a complicated test harness" don't go together for me. If
they do for you then enjoy :-)

------
huhtenberg
A pointer you get from calloc() is guaranteed to be suitably aligned for any
object type, so it will not cause SIGBUS on RISC machines and the like. You
cannot just add an arbitrary sizeof(struct Header) to it and call the result a
general-purpose allocation. struct Header needs to be a bit more elaborate
than what they have. In fact, it shouldn't be a _struct_, but a _union_ -

    
    
      union Header 
      {
          var type;
          max_align_t foo;
      };
    

whereby you get max_align_t from your standard libraries or do something like
this -

    
    
      typedef union max_align
      {
        int i;
        long l;
        double d;
        void * p;
        ...
      } max_align_t;

~~~
tomp
Are you sure this is not wasteful? I always thought you could align on any
word (32 bits) on a 32-bit x86 CPU, but your max_align_t would be 64 bits long
(because it includes a double as well).

~~~
huhtenberg
That's why you should try to use the standard (or compiler-specific) version
of max_align_t.
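
With a C11 compiler, where `max_align_t` comes from `<stddef.h>`, the union approach might be sketched like this. `alloc_aligned` and `free_aligned` are hypothetical names, and the `align_` member is never read; it exists only to give the header the size and alignment of the strictest standard type.

```c
#include <assert.h>
#include <stddef.h>    /* max_align_t (C11) */
#include <stdint.h>
#include <stdlib.h>

typedef void *var;

/* Padding the header to max_align_t keeps the payload that follows it as
   well aligned as the block returned by calloc(). */
union Header {
    var         type;
    max_align_t align_;    /* never read; size and alignment only */
};

static_assert(sizeof(union Header) % _Alignof(max_align_t) == 0,
              "header size must preserve payload alignment");

/* Returns data_size zeroed, max-aligned bytes, or NULL on failure. */
var alloc_aligned(var type, size_t data_size)
{
    if (data_size > SIZE_MAX - sizeof(union Header))
        return NULL;
    union Header *head = calloc(1, sizeof(union Header) + data_size);
    if (head == NULL)
        return NULL;
    head->type = type;
    return head + 1;       /* first byte after the header */
}

/* Release a block obtained from alloc_aligned. */
void free_aligned(var self)
{
    if (self != NULL)
        free((union Header *)self - 1);
}
```

The cost is the one raised in the reply above: the header takes up `sizeof(max_align_t)` bytes per object even when the payload needs less alignment.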

~~~
danieltillett
max_align_t is only defined in C11?

Edit: Yes, this looks to be the case for those of us using C99. Paul Eggert
put in a patch for gnulib to cover this [1].

      typedef union {
        char *__p;
        double __d;
        long double __ld;
        long int __i;
      } max_align_t;

One thing I did learn looking through this patch was how NULL differs between
C++ and C. This is certainly not something I had ever considered.

[1] https://lists.gnu.org/archive/html/bug-gnulib/2014-12/msg00170.html

------
TazeTSchnitzel
These aren't fat pointers. A fat pointer is, well, _fat_ : it's twice the size
of a normal pointer, because it has both a memory address and a length. But
Cello's 'fat pointers' are just regular pointers that happen to point to
memory preceded by a length. Cello's 'fat pointers' don't solve the array
slice problem.
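
For comparison, a fat pointer in the sense described here could be sketched as a two-word struct. `struct int_slice`, `slice_sub`, and `slice_sum` are illustrative names. Because the length travels with the address, a sub-range is just a new (pointer, length) pair and needs no header in front of the data.

```c
#include <assert.h>
#include <stddef.h>

/* Address and length together: twice the size of a plain pointer. */
struct int_slice {
    int   *ptr;
    size_t len;
};

/* View of elements [start, end) of s; nothing is copied.
   Out-of-range bounds are clamped to the slice. */
struct int_slice slice_sub(struct int_slice s, size_t start, size_t end)
{
    assert(s.ptr != NULL);
    if (end > s.len)
        end = s.len;
    if (start > end)
        start = end;
    return (struct int_slice){ s.ptr + start, end - start };
}

/* Example consumer: the callee receives the length along with the data. */
long slice_sum(struct int_slice s)
{
    long total = 0;
    for (size_t i = 0; i < s.len; i++)
        total += s.ptr[i];
    return total;
}
```

A header-prefixed pointer cannot express `slice_sub` this way, since the interior of an array has no header in front of it.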

~~~
jasonwatkinspdx
Yeah, and the two ideas can be combined as well, with fat pointers that are
base pointer and offset/index pairs.

------
nocsaer1
Don't get me started...

It is called a struct. It holds an arbitrary collection of any type of data
safely while enabling the lazy programmer to pass only one parameter around.
It is also safe (as much as it can be in C), unlike pointers that hide data
behind their back and are just asking for trouble.

I will ignore all the other problems and point out the worst offender: the
stack allocation example. That is not how you allocate arbitrary data on the
stack; an automatic char array is not malloc. In fact, in C you can't do it
without using special platform-specific functions. Anything else will cause
undefined behavior.

~~~
ploxiln
Variable length arrays on the stack are in the C99 standard.

(If you're using Microsoft stuff, you have my condolences.)

~~~
nocsaer1
My argument doesn't rely on their (non) availability.

------
tomp
If I understand this correctly, these are not _fat pointers_ as used in e.g.
Rust, where instead of a pointer you'd be passing in an argument a struct with
two pointers, one to the object and the other to the implementation of the
interface/typeclass.

Instead, they're more akin to vtables in C++, i.e. placing a pointer to the
interface at the beginning of the object, before the data.

I need to read the source (or generated code) to fully understand how they
implemented support for multiple interfaces/typeclasses.

~~~
ecma
I think you might be misunderstanding the intentions of this. AFAICT it's not
a mechanism for supporting multiple interfaces, just a bit of light trickery
to get a length passed into functions within the limitations of array->pointer
decay in C. It also helps avoid cases where a miscalculated n is passed by
'encapsulating' it with the data.

Edit: I stand corrected. Just had a look over the source and this does a lot
more than the OP link indicates. The Github README is more informative.

------
mpu
Anybody who has written a medium-sized project in C is aware of such a trick.
And also of the load of problems that comes with it! (They are not really
pointers anymore.) Please stop trying to make C look like JavaScript; just
create a new language from scratch. The pile of syntax tricks you're playing
with is bound to collapse.

------
yason
A subset of these kinds of tricks is usually embedded in the source code of
any project of decent size or larger. But that sort of implementation usually
comes with big disclaimers that spell out the hacky parts in very loud terms.
I'm kind of wary of a library trying to glue new behaviour onto C on a general
level. There are inevitably too many corner-cases which break the newly built
abstraction. Those corner-cases can be explained away for one specific
codebase that already comes with its own set of practices, rules of thumb, and
coding patterns that the programmers are already forced to learn. But a
library as general as this is easily used liberally, even casually, and
without a complete map of the shortcomings and broken corners I can see lots
of potential hair-pulling coming right up when things are no longer what they
seem they should be.

------
asimjalis
Neat punning on C pointers with minimal increase in memory footprint.

Some people have suggested putting array sizes in structs. But that will
increase the level of indirection and hurt performance. Also it will require
rewriting most applications.

With Cello's approach you just have to avoid pointer arithmetic. Everything
should work.

~~~
scaramanga
Only if you pass a pointer to that struct. But you can just pass it by-val and
it will act the same way as 2 arguments.

------
Arnavion
This is the same as Pascal strings, popular among Windows programmers in the
form of the BSTR type. BSTR also uses the same trick of putting the length
before the string data so that the BSTR pointer can be treated as a char
pointer.

------
sagargv
I like the idea of breathing more life into C. Look at what's happened to C++.
These days, when I need the efficiency, I prefer coding in C to C++. But I'm
not sure Cello is the right way to make C better. If you're starting a new
project, what's wrong with a struct that has an array and length of the array?
Cello seems to break the simplicity of C and its memory model.

------
Buge
For alloc it says

    
    
        head->type = type;
    

Where does type come from?

------
mrfusion
What's the ELI5 here?

~~~
codezero
Put size information before the pointer in memory.

