
Object oriented programming with ANSI-C (1993) [pdf] - geospeck
https://www.cs.rit.edu/~ats/books/ooc.pdf
======
yawaramin
The first chapter, on abstraction and information hiding, is probably the most
important. Every programmer should know how to do it. It's one of C's main
strengths that it allows abstract data types so easily. ADTs in fact allow
better binary compatibility since you only expose pointers as your public API,
not structures of different shapes and sizes.

I'm continually surprised by how often people insist on putting struct
definitions right in their header files, defeating the whole idea of type
abstraction.

~~~
sparkie
You need to define a struct in a header to pass it by value, which is usually
cheaper than having to dereference a pointer (often resulting in a cache
miss). The style presented in the paper has several pointer dereferences; for
example, each type has another pointer to a class header that contains its
constructor and some other function pointers.
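
To make the extra indirection concrete, here is a rough sketch of that kind of class-header scheme. It is written from the description above rather than copied from the book, so the field names, `PointClass`, and `struct Point` are assumptions for illustration only.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdlib.h>

/* Class header: one shared, read-only descriptor per type. */
struct Class {
    size_t size;                              /* bytes to allocate per object */
    void *(*ctor)(void *self, va_list *app);  /* constructor, reads varargs */
    void *(*dtor)(void *self);                /* destructor, may be NULL */
};

/* Generic allocator: every object starts with a pointer to its class header. */
void *new(const void *klass, ...) {
    const struct Class *class = klass;
    void *p = calloc(1, class->size);
    assert(p != NULL);
    *(const struct Class **)p = class;        /* first dereference target */
    if (class->ctor) {
        va_list ap;
        va_start(ap, klass);
        p = class->ctor(p, &ap);              /* second hop: through the header */
        va_end(ap);
    }
    return p;
}

void delete(void *self) {
    const struct Class **cp = self;
    if (self && *cp && (*cp)->dtor)
        self = (*cp)->dtor(self);
    free(self);
}

/* A hypothetical concrete type using the scheme. */
struct Point {
    const struct Class *class;                /* must be the first member */
    int x, y;
};

static void *Point_ctor(void *self_, va_list *app) {
    struct Point *self = self_;
    self->x = va_arg(*app, int);
    self->y = va_arg(*app, int);
    return self;
}

static const struct Class PointClass = { sizeof(struct Point), Point_ctor, 0 };
const void *Point = &PointClass;
```

A call such as `new(Point, 3, 4)` goes from the `Point` pointer to the class header, then to the constructor pointer, before it touches the object itself. That chain is the cost being described.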

I personally do not like the style in this paper as it makes APIs difficult to
read - you need to resort to documentation to understand anything. In the
first set example:

    
    
        #ifndef SET_H
        #define SET_H
        extern const void * Set;
        void * add (void * set, const void * element);
        void * find (const void * set, const void * element);
        void * drop (void * set, const void * element);
        int contains (const void * set, const void * element);
        #endif
    

What type is add expecting and what is it going to return? We have no clue,
other than to continue reading the docs.

A "better" approach to an opaque pointer type is to simply declare a struct in
the header, but only define it in the implementation file. The above API
becomes this, where it's pretty obvious from the arguments and return types
(it also avoids having to do an explicit manual cast in each of the methods):

    
    
        #ifndef SET_H
        #define SET_H
        #include "Object.h"
        typedef struct set_t Set;
        Object* set_add (Set*, const Object* element);
        Object* set_find (const Set*, const Object* element);
        Object* set_drop (Set*, const Object* element);
        int set_contains (const Set*, const Object* element);
        Set* set_alloc(void);
        void set_free(Set*);
        #endif
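
A minimal sketch of what could sit behind a header like this one, assuming elements are compared by address and that `Object` is just another opaque type. The struct layout and the growth policy are assumptions for illustration, not anything the paper prescribes.

```c
#include <assert.h>
#include <stdlib.h>

/* --- would live in the header: declarations only --- */
typedef struct object_t Object;   /* hypothetical stand-in for Object.h */
typedef struct set_t Set;         /* incomplete type: callers cannot look inside */

Set *set_alloc(void);
void set_free(Set *);
Object *set_add(Set *, const Object *element);
Object *set_find(const Set *, const Object *element);
Object *set_drop(Set *, const Object *element);
int set_contains(const Set *, const Object *element);

/* --- would live in the .c file: the only place the layout is known --- */
struct set_t {
    const Object **items;   /* elements are identified by address */
    size_t count;
    size_t capacity;
};

Set *set_alloc(void) { return calloc(1, sizeof(Set)); }

void set_free(Set *set) {
    if (set) { free(set->items); free(set); }
}

Object *set_find(const Set *set, const Object *element) {
    assert(set != NULL);
    for (size_t i = 0; i < set->count; i++)
        if (set->items[i] == element)
            return (Object *)element;
    return NULL;
}

int set_contains(const Set *set, const Object *element) {
    return set_find(set, element) != NULL;
}

Object *set_add(Set *set, const Object *element) {
    assert(set != NULL);
    if (set_contains(set, element)) return (Object *)element;
    if (set->count == set->capacity) {          /* grow the array when full */
        size_t cap = set->capacity ? set->capacity * 2 : 8;
        const Object **grown = realloc(set->items, cap * sizeof *grown);
        if (!grown) return NULL;
        set->items = grown;
        set->capacity = cap;
    }
    set->items[set->count++] = element;
    return (Object *)element;
}

Object *set_drop(Set *set, const Object *element) {
    assert(set != NULL);
    for (size_t i = 0; i < set->count; i++) {
        if (set->items[i] == element) {
            set->items[i] = set->items[--set->count];  /* swap in the last item */
            return (Object *)element;
        }
    }
    return NULL;                                /* element was not in the set */
}
```

Because callers only ever see `Set*`, the array could later be swapped for a hash table without recompiling them.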
    

Obviously, this doesn't play well with the full OOP approach this paper takes,
but IMO, it's better to just compose structs using the style above, and leave
memory management up to the type rather than trying to have a fancy global
"new" and pointers to constructors.

~~~
yawaramin
Yeah, when you can't afford the cache misses I understand that you would pass
by value. But often, these things are done as premature optimisations. We can
always expose a struct's definition later; but we can't hide it once it's
exposed.

I agree with your API redesign. In fact, if we forget about trying to do OOP,
we would get an even nicer design, IMO:

    
    
        typedef struct set_t* set;
        typedef void* set_elem;
    
        set set_new(void);
        void set_free(set);
        set_elem set_add(set, set_elem);
        set_elem set_find(set, set_elem);
        set_elem set_drop(set, set_elem);
        int set_contains(set, set_elem);
    

Edit: hiding the pointer in the typedef because if we later decide to expose
the set struct, we minimise the required changes to the API.

~~~
kgabis
Hiding a pointer behind a typedef is a terrible idea (like non-const reference
arguments in C++). You never know if something you pass into functions could
be modified by them or not. I'd much rather change every place in code where
set was used than deal with code that hides such important information.
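
A small sketch of the concern, using a hypothetical one-field `struct set_t`: once the pointer is inside the typedef, `const` applies to the pointer itself, so a signature can no longer promise that the set is left alone.

```c
#include <assert.h>

struct set_t { int count; };      /* hypothetical layout, for illustration */
typedef struct set_t *set;        /* pointer hidden behind the typedef */

/* `const set` means `struct set_t *const`: the pointer is const, but the
 * struct it points to is still writable. This compiles and mutates. */
void clear_hidden(const set s) {
    s->count = 0;
}

/* With the pointer visible, const can qualify the pointee instead.
 * Writing `s->count = 0;` in this function would be a compile error. */
int count_visible(const struct set_t *s) {
    return s->count;
}
```

With the typedef'd pointer, there is no way to write a parameter as pointer-to-const short of adding a second typedef.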

~~~
yawaramin
Ha, I knew someone would say this ;-) You never know, that is, unless you just
follow the type to its declaration and find out in like a couple of seconds. A
modern IDE should just show the typedef on hover.

~~~
Too
With that argument you might as well redeclare everything with unexpected
macros; your IDE will do macro expansion on hover anyway.

Or why write descriptive function names at all? Ctrl-click to view the
implementation of the function is just one click away...

A typedef'd pointer is a huge anti-pattern.

------
sesteel
This is one of my favorite reads. About 10 years ago I was developing a toy
programming language that compiled to ANSI-C and this was a great resource to
me at the time.

------
kuwze
Reminded me of the C Object System by Laurent Deniau[0].

[0]:
[http://ldeniau.web.cern.ch/ldeniau/cos.html](http://ldeniau.web.cern.ch/ldeniau/cos.html)

------
pdfernhout
I wrote an AI Expert article on a similar theme ("Simulating Intelligent
Interacting Objects in C", AI Expert, January 1989) about using C for OO
programming of a robot simulator -- where the first argument is a pointer to a
data structure representing the object. I prefixed all the function names with
the class name, like Robot_move(robot, x, y). I was inspired by IIRC Eric
Schildt's previous article elsewhere on how OO is an attitude.

Looks like a very comprehensive book though compared to just some short
articles (even if it does not cite those as previous work). In general, this
idea of opaque handles was (and is) very common in C programming, like with
file operations. So, the question is what conventions we use or what
convenience superstructure we build around that basic idea of handles.

------
wybiral
One thing that was really educational for me was reading how CPython
implemented their object system.
[https://github.com/python/cpython/blob/master/Include/object...](https://github.com/python/cpython/blob/master/Include/object.h)

Basically objects are structs that all share the same header which contains a
pointer to the object type (ob_type) and the reference count (ob_refcnt) for
memory management.

That way you can always cast back to a pointer of that header struct and
access ob_type and ob_refcnt no matter what the actual implemented struct
contains afterwards.

ob_type contains the name of the type and function pointers for common methods
used by objects (getattr, setattr, hash, arithmetic, etc).
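
A stripped-down sketch of the shared-header idea. This is not CPython's actual code: `ObjHeader`, `TypeInfo`, `IntObj`, and the helper functions are hypothetical names standing in for the real structures.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct type_info TypeInfo;

/* Common header: every object struct begins with these two fields. */
typedef struct {
    size_t refcnt;            /* reference count for memory management */
    const TypeInfo *type;     /* pointer to the shared type descriptor */
} ObjHeader;

struct type_info {
    const char *name;
    void (*destroy)(ObjHeader *self);   /* per-type cleanup */
};

/* A concrete object: the header comes first, type-specific data after. */
typedef struct {
    ObjHeader head;
    long value;
} IntObj;

static void int_destroy(ObjHeader *self) { free(self); }
static const TypeInfo IntType = { "int", int_destroy };

ObjHeader *int_new(long value) {
    IntObj *obj = malloc(sizeof *obj);
    if (!obj) return NULL;
    obj->head.refcnt = 1;
    obj->head.type = &IntType;
    obj->value = value;
    return &obj->head;        /* same address as obj: head is the first member */
}

/* These work on any object, whatever follows the header. */
void obj_incref(ObjHeader *o) { o->refcnt++; }
void obj_decref(ObjHeader *o) { if (--o->refcnt == 0) o->type->destroy(o); }
const char *obj_type_name(const ObjHeader *o) { return o->type->name; }

/* Casting back down is only valid after checking the type pointer. */
long int_value(const ObjHeader *o) {
    assert(o->type == &IntType);
    return ((const IntObj *)o)->value;
}
```

The cast in `int_value` relies on the C guarantee that a pointer to a struct and a pointer to its first member can be converted to each other.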

------
jacinabox
For me the notion of domain is helpful in understanding OO. The definition of
a domain is that everything inside it stands in determinate relations, whereas
while multiple domains can interact they don't stand in determinate relations.
Inheritance in classes, in adding more features to a class, can also have
knock-on effects that interfere with the assumptions of other classes that are
closely coupled to it, making classes an example of domain. Whole computers in
a network are also domains; nodes can go down in which case no determinate
expectations can be placed on them.

The existence of multiple domains is essentially postmodern as it creates, for
practical purposes, multiple sources of truth. Because the expectations of one
domain on another are weak, this naturally leads to a domain becoming robust
against errors and inconsistencies from other domains, which leads to a
stronger system. This type of robustness also helps components still be valid
when they are placed into a different theory/environment that has additional
properties. In some sense FP and OO are aiming at the same thing, that is
making definitions that remain valid when the "arrows" of their environment
change their meaning.

For a different perspective on this see Gerald Jay Sussman, "We Really Don't
Know How to Compute!"

------
kasajian
Another article related to this:
[https://www.codeproject.com/Articles/22139/Simply-Object-Ori...](https://www.codeproject.com/Articles/22139/Simply-Object-Oriented-C)

I wrote it a while back just to see what I would come up with if I had to
solve that problem for myself using only macros.

------
pakl
I am a fan of libco2[1] which is a minimalist object-oriented layer on top of
C.

[1] [https://github.com/peterpaul/co2/tree/master/examples/my-obj...](https://github.com/peterpaul/co2/tree/master/examples/my-object-carbon/src)

------
Koshkin

      > void * find (const void * _set, const void * _element)
    

Such an eyesore, and I think it goes strongly against the tradition of coding
in C when someone uses the asterisk in the pointer context in such a way that
it looks like an infix operator, i.e. as if it were multiplication.

As to achieving a _reasonably looking_ OO in C, this is why C++ was born...

~~~
4lch3m1st
Anytime I write C code, I try to keep the asterisk close to the type:

    void* foo(void* bar);

I know that the asterisk technically belongs with the variable name when
declaring it, but it helps to wrap my head around them, as a reminder that
they're a pointer type.

~~~
sigjuice
The world mostly does it the other way.

A couple of code examples:

[https://github.com/apache/httpd/blob/trunk/server/provider.c](https://github.com/apache/httpd/blob/trunk/server/provider.c)

[https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux....](https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/kernel/fork.c)

Almost all documentation, references and literature: Here are a couple of man
pages.

[http://man7.org/linux/man-pages/man3/gethostbyname.3.html](http://man7.org/linux/man-pages/man3/gethostbyname.3.html)

[https://developer.apple.com/library/mac/documentation/Darwin...](https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man3/execv.3.html)

Here is a draft of the ISO C standard:

[http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf)

~~~
ngcc_hk
I agree with you, based on the fact that in most code we write

    
    
        void *s = ...   and then use it as
        s = ...
    

But I agree more that

    
    
        void* s = ...   and then use it as
        s = ...
    

is more understandable, as the first form confused me when I was learning
pointers. It still does. void* reads like a type, just as void reads like a
return type.

~~~
jcelerier
The only thing that sucks hard is when you do :

    
    
        int* s1, s2;
    

Here s2 is an int, not an int*, so you have to write

    
    
        int* s1, *s2;
    

this is braindead imho
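
A quick way to check this, assuming a C11 compiler: `_Generic` picks a branch by the static type of its operand, so the hypothetical `IS_INT_PTR` macro below reports whether a variable really is an `int *`.

```c
#include <assert.h>

/* C11: evaluates to 1 if the expression has type `int *`, otherwise 0. */
#define IS_INT_PTR(x) _Generic((x), int *: 1, default: 0)

void declarator_demo(void) {
    int* s1 = 0, s2 = 0;    /* the * binds to s1 only */
    int *p1 = 0, *p2 = 0;   /* one * per declarator */

    assert(IS_INT_PTR(s1) == 1);
    assert(IS_INT_PTR(s2) == 0);   /* s2 is a plain int */
    assert(IS_INT_PTR(p1) == 1);
    assert(IS_INT_PTR(p2) == 1);
}
```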

~~~
megaman22
As long as you're not using prehistoric versions of C, you don't have to
define all variables up front, so there are far fewer justifiable reasons for
that kind of multiple declaration.

~~~
Koshkin
Except nothing will save you from the "one true way to write C" when you try
to write a declaration of a pointer to a function, for example.

~~~
sigjuice
You can adjust your function pointer typedefs if you like. e.g. signal(2)
[http://man7.org/linux/man-pages/man2/signal.2.html](http://man7.org/linux/man-pages/man2/signal.2.html)

    
    
      typedef void (*sighandler_t)(int);
      sighandler_t signal(int signum, sighandler_t handler);
    

can be rewritten as

    
    
      typedef void sighandler_t(int);
      sighandler_t *signal(int signum, sighandler_t *handler);
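
A short sketch showing that the two spellings name the same type. It uses hypothetical `handler_ptr`, `handler_fn`, `set_handler`, and `get_handler` names rather than the real signal API, so it can stand alone.

```c
#include <assert.h>

typedef void (*handler_ptr)(int);   /* typedef of a pointer to function */
typedef void handler_fn(int);       /* typedef of the function type itself */

static handler_fn *current = 0;     /* the * is visible at the point of use */
static int last_signum = 0;

/* An example handler that records the value it was called with. */
static void record(int signum) { last_signum = signum; }

/* Installs a new handler and returns the previous one. */
handler_fn *set_handler(handler_fn *next) {
    handler_fn *prev = current;
    current = next;
    return prev;
}

/* `handler_ptr` and `handler_fn *` are the same type, so this needs no cast. */
handler_ptr get_handler(void) { return current; }
```

The second form keeps the asterisk in every declaration that uses the typedef, which fits the "asterisk next to the name" convention discussed above.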

