Can you use a class in C? (d4ckard.github.io)
94 points by D4ckard on Aug 13, 2023 | 79 comments



It took me a long time to realize that I actually prefer not using classes.

It's the small things in programming that can make a huge difference. It always felt like I should be using the C++ way because it was slightly dry-er and looked nicer, and I could have all these OO features.

Like when you see `int get_numer(void *r)` or Python's `def get_numer(self)`, or Go's `func (rational *Rational) GetNumer() (n int)`, you think: "how silly, why does the object pointer need to be in scope, just use `this`". But this tiny thing is liberating. It allows you to see that everything is just functions operating on data. And that a "method", is just a function that is taking _the entire object_ as a parameter...and hence a dependency. Which allows you to think: hmm, does this function really need to depend on the entire object...maybe it can be a separate utility function all by itself without any connection to the class. And maybe it doesn't actually need access to any of the other functions in the class...and maybe the class could be split up...etc.
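A minimal sketch of that idea in C, assuming a hypothetical `struct rational` (all names here are illustrative and not taken from the article). Once the object pointer is an explicit parameter, it becomes easier to spot which functions need the whole object and which, like `gcd`, need none of it:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical rational-number type, used only for illustration. */
struct rational {
    int numer;
    int denom;
};

/* A "method": it takes the whole object, so it depends on all of it. */
int rational_numer(const struct rational *r)
{
    assert(r != NULL);
    return r->numer;
}

/* A plain utility: it needs two ints, not a rational, so it can live
   outside any "class" and be reused elsewhere. */
int gcd(int a, int b)
{
    while (b != 0) {
        int t = a % b;
        a = b;
        b = t;
    }
    return a < 0 ? -a : a;
}

/* Reduce in place; only this function ties gcd() to the struct. */
void rational_reduce(struct rational *r)
{
    assert(r != NULL);
    int g = gcd(r->numer, r->denom);
    if (g != 0) {
        r->numer /= g;
        r->denom /= g;
    }
}
```

Calling `rational_reduce` on `{6, 8}` leaves `{3, 4}`, while `gcd` stays usable by code that has never heard of `struct rational`.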

I just watched a [talk][1] by Alan Kay, the inventor of Smalltalk and of the phrase "object-oriented", who never stops shitting on C++.

Yet OO is still absolutely everywhere.

[1]: https://www.youtube.com/watch?v=oKg1hTOQXoY


Alan Kay isn't going to agree that you should simply pass structures around and use element-level procedures. Message passing as a concept is a higher order version of what you see in C++, not what you see in C. A big reason Kay dislikes C++ is because it doesn't go far enough in terms of dynamic dispatch and the ability to substitute implementations dynamically at runtime. One way to put it is: C++ makes decisions at compile time that should be made at runtime.

What you are saying certainly applies neatly to getters and setters, but consider that classes are used for far more complex use cases than as a simple data holder. Indeed, if all you need is to store two integer components, you might not need a class.

Of course, in Smalltalk and its ilk, you only have objects. So there is no opportunity to access data from within a structure (externally) without passing messages.


Indeed. C++ was initially a preprocessor (Cfront) by Stroustrup. Dynamic dispatch was considered too slow at the time, so C++-style objects and methods were chosen as a faster version of Simula [1]. Objective-C was an alternative that used something closer to Smalltalk, but it was slower and arrived a few years later. C++ was first released in 1983 internally at AT&T, and for perspective, the 286 was released around the same time with a clock speed of 8 MHz.

[1] https://en.wikipedia.org/wiki/C%2B%2B


What he has said applies to any member function. It is a nuisance to have to add a parameter, but it certainly does not only apply to getters and setters.

Dynamic dispatch is not necessarily evil, but that also doesn't exclude C. I'm not sure where this idea that C++ is inherently more dynamic or more static comes from. For a long time you could implement all the dynamic behavior of C++ in C, and it took a long time for C++'s templates to become as expressive as C macros.

Sometimes, less is more. That's what C++ never gets right.


Also, Bjarne Stroustrup said that this (pun intended) was a mistake, since often there is no "most important" object.

If you want to do generic programming, it's much better to write f(a, b) instead of a.f(b).

Suppose I'm writing a generic algorithm that needs some function distance(A a, B b) for generic A and B. Distance is probably a commutative function, why would a.distance(b) be better than b.distance(a)? It's arbitrary.

Secondly, if functions are first-class citizens, and I don't "own" the types A and B, it's much easier to implement distance(A a, B b) myself than it is to extend the types A and B with a member function `distance`.


Also see:

Effective C++ (by Scott Meyers) Item 23:

“Prefer non-member non-friend functions to member functions. Doing so increases encapsulation, packaging flexibility, and functional extensibility.”

The text has a lot more detail but that’s a brief summary. Folks just need to read the literature then this sort of knowledge would be in common use.

One downside of language evolution is a lot of people focusing on new language features, etc, but then some of this older, important knowledge gets skipped over.


It would make a huge difference if that style were used more in practice.

OOP is just functions that are only polymorphic in 1 argument, when you want it on all of them.


Multiple dispatch for the win...


is it not apparent that a.distance(b) and b.distance(a) should both exist? if they don't, the API is poorly designed


This is why I like Rust’s associated functions concept. It is just functions and data. But with organization of clear relationships. “These functions act on these data.” (And a bit of syntactic sugar)


One disadvantage here is taking the entire struct on each operation even if only a single part of it is needed. That can be much harder to deal with due to ownership rules (if &mut self or all of self is taken). Free functions can be easier then.


Thank you. You put into words what I was thinking.

The best paradigm in programming has to be procedural style. The cleanest code I’ve written and read has always been procedural. I even maintain a monster VBA Word macro at work and, after some much needed refactoring, it's pretty easy to understand (ignoring VBA’s warts).

The flow of logic and dependencies on data is much easier to follow in procedural.


I prefer functional. Procedural is not restrained enough and allows for mutable global state. This can go off the rails when multiple people are in the code.

Functional stresses composition so functions should stay small, and ADTs allow for all effects (errors!) to be represented. Strong typing helps reduce the number of tests I need to write, as well.


I think ADTs and effect systems are orthogonal, though. A “print” has an effect, but I don’t know of languages where its failure is propagated (for example, in Rust, println doesn’t evaluate to a Result).


I'm a self-taught programmer, and I'm the lone programmer at a small non-tech company. I've mostly learned by building, and by reading documentation on language features, libraries, etc when I get stuck. I was pretty aware of "tutorial hell" pretty early on and thus avoided it, which by proxy meant I avoided most OOP dogma. I probably worked a lot harder than necessary and reinvented the wheel a few times early on, but at this point I think I write decent code, though I admit I'm somewhat siloed (e.g. I've never been subjected to a professional code review).

I recently watched a talk on just what precisely Functional Programming is, and the whole time I just thought "wait, doesn't everyone do this? Isn't this just the overwhelmingly obvious way to write code?" Around the same time, I found myself running into some very OO projects, and simply couldn't understand the actual reason for all of the ceremony and ridiculous amounts of encapsulation. And this wasn't in Java, which of course is infamous for its seemingly infinite amount of ceremony -- this was the documentation of a widely-used GUI library for Python.

Like, to my mind, classes are useful when you need to create objects whose values are unknown, i.e. the analogy where the class is a "cookie cutter" which creates the cookies. Classes are perfectly fine for that use-case but by no means should it be that which forms the backbone of your project; my (admittedly limited) view is that stand-alone functions are far and away the lowest-overhead and reliable device for composing parts of a system.

I really get the feeling that OOP is sort of like how synths were in the 80s -- everyone just kinda lost their minds for a second and used them everywhere, whether it made sense for the music or not. But from what I can tell, OOP just isn't really the right paradigm for a whole bunch of projects, and often the design of the code vs the design of the thing feels like ramming a square peg into a round hole.

Edit: all that said, if the above smacks of the Dunning-Kruger effect, I'd love to be made aware of that.


Basic functional programming concepts are good, but it can easily go off the rails when people take it too far.


you can't reuse names with top-level functions:

    package hello
    
    type cat int
    
    func greet(c cat) cat {
       return c + 1
    }
    
    type dog string
    
    func greet(d dog) dog { // greet redeclared in this block
       return d + "one"
    }

but you can with methods:

    package hello
    
    type cat int
    
    func (c cat) greet() cat {
       return c + 1
    }
    
    type dog string
    
    func (d dog) greet() dog {
       return d + "one"
    }


I don’t think that’s a strong argument. It’s just function overloading that some languages enable, others don’t. Not a categorical difference.


OK please show me the equivalent code in C then


It's possible with _Generic, but there are a lot of caveats. Any seasoned C dev would just have two functions with a prefix or suffix, like cat_greet() and dog_greet().

But if you insist, it would look something like:

    typedef ... cat;
    typedef ... dog;

    void cat_greet(cat *obj) {/*do stuff */}

    void dog_greet(dog *obj) {/*do stuff */}

    /* greet() is called with pointers below, so match on the pointer types */
    #define greet(X) _Generic((X), cat *: cat_greet, dog *: dog_greet)(X)

    int main(void)
    {
         cat mycat;
         dog mydog;
         greet(&mycat);
         greet(&mydog);
         return 0;
    }


that is really, really ugly. using methods, I don't have to mess with function prefixes, and I don't have to pollute the global scope.


Again that's this language constraining you to that. With procedure overloading it's not required that you have to have that ugly API. For instance the following is valid Nim:

  type
    Cat = object # these are just 'struct' in a different skin
    Dog = object

  proc greet(c: Cat): string = "Meow"
  proc greet(d: Dog): string = "Bark"

  assert Dog().greet() == "Bark"
  assert Cat().greet() == "Meow"


I have tried to sell myself on using functions (not pure functions, though, but ones that accept an object and modify it) but ultimately stuck with OO. I try to find a balance in between: if something is processing-heavy, it's going to be a function, but if it does not do much by itself and relies on a lot of state, it's going to be a method in a class, which may call a function or a few.

The biggest benefit is the structure, you get an API that can control any aspect of the app out of the box without the need to redeclare stuff.


How would you do DI abstractions using just functions? (Honest question)

I am so much used to the idea of swapping out functionality (set of functions and a state). For example, just recently I had to swap out an Excel spreadsheet reader/writer library in Go for performance reasons, and would have bled a lot (due to refactoring) had it not been encapsulated as an interface.


Function pointers in a struct
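A rough sketch of what that can look like, assuming a made-up `sheet_reader` interface with a single operation (none of these names come from a real spreadsheet library). The caller only sees the table of function pointers plus an opaque state pointer, so a backend can be swapped without touching `total_rows`:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "interface": function pointers plus opaque state. */
struct sheet_reader {
    int (*row_count)(void *state);
    void *state;
};

/* Backend A: its state is a plain int holding the row count. */
static int slow_row_count(void *state)
{
    return *(int *)state;
}

struct sheet_reader make_slow_reader(int *rows)
{
    struct sheet_reader r = { slow_row_count, rows };
    return r;
}

/* Backend B: a drop-in replacement with a different state type. */
struct fast_state {
    int rows;
};

static int fast_row_count(void *state)
{
    return ((struct fast_state *)state)->rows;
}

struct sheet_reader make_fast_reader(struct fast_state *s)
{
    struct sheet_reader r = { fast_row_count, s };
    return r;
}

/* Caller code depends only on the interface, not on either backend. */
int total_rows(const struct sheet_reader *readers, size_t n)
{
    int total = 0;
    assert(readers != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        total += readers[i].row_count(readers[i].state);
    return total;
}
```

This is essentially a hand-written vtable; the trade-off is that nothing checks that `state` matches the functions it is paired with.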


How would either approach be different? In both cases you would have to rewrite the code that directly interacts with your Excel library.


And a well defined class defines the structure and valid permutations for that data in a clear and easy to find place.


And it allows you to create any number of objects of its kind without explicitly needing to allocate memory for them. That is a major advantage, I think; without it, the code gets cluttered with all the explicit memory allocation and deallocation.


Nowadays OO is seen as a mistake.

Interfaces are much better. What's the difference? It's this: no inheritance.

Still, with interfaces your complaints remain unsatisfied. But add generic functions and then you can have the mix of OO-like and not-OO-like APIs you have in mind.


I don't want to imagine something like gson or sun.tools.tree without inheritance. That would be a disaster.

Composition? AssignShiftLeftExpression has-a AssignOpExpression which has-a BinaryAssignExpression which has-a BinaryExpression which has-a UnaryExpression which has-a Expression which has-a Node?

    return this.assignOpExpression.binaryAssignExpression.binaryExpression.unaryExpression.expression.node.op;

or

    return this.op


You want "sum types", aka "algebraic types".
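As a hedged illustration in C, which has no built-in sum types: a hand-rolled tagged union, assuming a tiny hypothetical expression tree (not the actual sun.tools.tree classes). The tag lives directly on the node, so reaching the "op" never requires walking a chain of has-a members:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical expression kinds; the tag plays the role of "op". */
enum expr_kind { EXPR_LITERAL, EXPR_NEGATE, EXPR_ADD, EXPR_SHIFT_LEFT };

struct expr {
    enum expr_kind kind;                /* shared by every variant */
    union {
        int literal;                    /* EXPR_LITERAL */
        const struct expr *operand;     /* EXPR_NEGATE */
        struct {
            const struct expr *lhs;
            const struct expr *rhs;
        } bin;                          /* EXPR_ADD, EXPR_SHIFT_LEFT */
    } u;
};

/* One function handles every variant by switching on the tag. */
int expr_eval(const struct expr *e)
{
    assert(e != NULL);
    switch (e->kind) {
    case EXPR_LITERAL:
        return e->u.literal;
    case EXPR_NEGATE:
        return -expr_eval(e->u.operand);
    case EXPR_ADD:
        return expr_eval(e->u.bin.lhs) + expr_eval(e->u.bin.rhs);
    case EXPR_SHIFT_LEFT:
        return expr_eval(e->u.bin.lhs) << expr_eval(e->u.bin.rhs);
    }
    return 0; /* not reached for valid tags */
}
```

Unlike real sum types in Rust or an ML, the C compiler does not stop you from reading the wrong union member for a given tag.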


I don't want them; you want them.


Heh, it's a manner of speaking. As in "the solution is to use sum types". Obviously I can't make you want that.


OO does not come from Smalltalk. Simula was the inspiration for OO in C++.


IME the best way to wrap C++ libraries for use in a C code base is to move at least one step higher than just wrapping every C++ class method with a C function (because this will result in an absolutely miserable C experience - you're basically writing C++ code in C, but without all the C++ syntax sugar).

Instead write a higher level module in C++ on top of the C++ library which implements some of the "application logic" and only exposes a very small app-specific (and non-OOP) C API to the rest of the application.

For instance with Dear ImGui I often write the entire UI code in C++ and then only expose a minimal C API which connects the UI code to the rest of the application (which is written in C).

Same with managing C++ object lifetimes, let the C++ side take care of this as much as possible, and don't expose pointers to C++ objects to the C side at all.


I agree, with the caveat that given a heavy duty, large, and well-understood and -designed C++ API, the tradeoff of using a tool-generated wrapper might be worth it, even if such a wrapper may be more obtuse than one specifically considered by a human developer.


> well-understood and -designed C++ API

Unfortunately these seem to be quite rare. Dear ImGui is such a well-designed C++ API, and I actually also use the code-generated cimgui bindings more frequently now in my projects (haven't tried the 'new' official bindings yet). Interestingly, the Dear ImGui C++ API is much closer to a typical "flat" C API than to a typical class-based C++ API (Dear ImGui is mostly just a flat soup of functions wrapped in a namespace, with some mild overloading).


Very true. Designing the C API with intention can be a way to fix the mistakes sometimes :-)


Because all pointers are the same size in C, and a pointer to a struct always points to the first member of the struct, you can do inheritance by having the "superstruct" be the first member of the substruct:

  struct foo {
    ...
  };

  struct bar {
    struct foo super;
    ...
  };

You can now safely pass a pointer to bar into a function that takes pointer to foo.
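A small runnable sketch of the pattern, with hypothetical `shape` and `circle` types standing in for foo and bar. C guarantees that a pointer to a struct, suitably converted, points to its first member, which is what makes the cast well-defined:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical base "class". */
struct shape {
    int id;
};

/* Derived "class": the base must be the FIRST member for the cast to work. */
struct circle {
    struct shape super;
    int radius;
};

/* Works on any struct whose first member is a struct shape. */
int shape_id(const struct shape *s)
{
    assert(s != NULL);
    return s->id;
}

/* Upcast without a cast expression: take the address of the first member. */
const struct shape *circle_as_shape(const struct circle *c)
{
    assert(c != NULL);
    return &c->super;
}
```

Both `shape_id(circle_as_shape(&c))` and `shape_id((const struct shape *)&c)` read the same `id`; the helper form just avoids an explicit cast.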


> all pointers are the same size in C

Generally true, but not guaranteed by the standard: https://stackoverflow.com/questions/1241205/


From that same post, though, referencing the C11 standard:

> "All pointers to structure types shall have the same representation and alignment requirements as each other."

And so the concern wouldn't apply to this pattern?


Traditionally, pointers to functions were an exception. But I haven't read the new standards. I just assume pointers aren't all the same size, and don't try to rely on their size.


I've been out of the c/c++ game for a long time, but it's always interesting to see what the edge, or non-guaranteed, elements are going to be whenever anyone claims:

> all|every|always ... in C.


Particularly function pointers may have different sizes than what you'd expect. Thus it's best practice to always use sizeof() for the specific type of pointer you are interested in if you need to know it at runtime just as with any normal non-pointer type.


Right, my bad. It's really struct pointers that are guaranteed to be the same size.
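A minimal way to encode that in C11, using two made-up struct types. The struct-pointer equality is the part the standard backs; the function-pointer size is only something you can query, not assume:

```c
#include <assert.h>
#include <stddef.h>

/* Two unrelated, hypothetical struct types of very different sizes. */
struct small { char c; };
struct large { double d[64]; };

/* All struct pointers share one representation, so their sizes match. */
static_assert(sizeof(struct small *) == sizeof(struct large *),
              "struct pointers must have the same size");

/* Not guaranteed to equal sizeof(void *), so ask for the exact type. */
size_t function_pointer_size(void)
{
    return sizeof(void (*)(void));
}

size_t object_pointer_size(void)
{
    return sizeof(void *);
}
```

On most current platforms the two functions return the same value, but code that stores a function pointer in a `void *` is relying on the platform rather than on the C standard.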


In modern times, this only applied to MSDOS


Near, far and huge pointers; those were the days ...


My philosophy on this is to treat C++ and C as completely different languages (as they actually are), like, say, Java and C or Python and C. Yes, it's nice that some part of the header file can be parsed in both languages and that most of the syntax is similar.

But once your expectation is that you have to do all the work at the FFI boundary it's less frustrating than to experience all the small mismatches as compiler errors or annoying runtime errors.


As with all of the languages you mention, consuming C from the other language is trivial. And since C++ is a superset of C that is certainly true there. Consuming the other language from C is difficult without a C compatible runtime API to help bridge the gap.

Of course the issue for C++, unlike the other examples here, is that it does not have a runtime layer that can be used to bridge the gap. So we either write a wrapper in C++ using extern "C", or use a tool to do it.

It seems there was a GSOC effort for SWIG to generate C wrappers for C++ libs but it might not have made it all the way? I don't see C as a target language on SWIG's site.

Still, a bespoke, high level design for the C wrapper is always going to be less painful for the consumer.


> And since C++ is a superset of C that is certainly true there.

C++ is not a superset of C. C99 has and C++ (at the time of writing) doesn't have: restricted pointers, designated initializers, variable length arrays, and probably more. These are language features that are actually used.

This doesn't change any of your core points.


> variable length arrays

It was such a terrible feature it was made optional in the C11 standard (you can be a conforming C11 compiler and not allow this feature) and will never, ever be implemented in a Microsoft compiler (while C is not a priority for MS, do note that they updated to C11 and C17).

You can read their reasoning here:

https://devblogs.microsoft.com/cppblog/c11-and-c17-standard-...

The Linux kernel used to make use of the feature and removed every instance of it from the code base:

https://www.phoronix.com/news/Linux-Kills-The-VLA

> Particularly over the past several cycles there has been code eliminating the kernel's usage of VLAs and that has continued so far for this Linux 4.20~5.0 cycle. There had been more than 200 spots in the kernel relying upon VLAs but now as of the latest Linux Git code it should be basically over.

While I do agree that it is wrong to consider C++ a superset of C, it is time to forget about C99's biggest mistake and treat it as if it didn't happen.


Agreed completely on all counts.

Never ever use VLAs.

My point is really just semantic, the overall argument I was responding to is intact.


C++ has had designated initializers since C++20.


Point taken, I was inaccurate. Out of order designated initializers still don't work in C++20, right? Which still means C++ isn't a superset of C.
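For concreteness, a short C sketch of the two forms in question, using a hypothetical `struct config`. Both initializers are valid C99; C++20 only accepts designators in declaration order and has no array designators:

```c
#include <assert.h>

/* Hypothetical struct, used only to demonstrate the initializer syntax. */
struct config {
    int width;
    int height;
    int depth;
};

/* Designators may appear in any order in C, and members that are not
   named (height here) are zero-initialized. */
struct config make_config(void)
{
    struct config c = { .depth = 3, .width = 1 };
    return c;
}

/* Array designators: element 4 is set, every other element is zero. */
int fifth_is_set(void)
{
    int table[8] = { [4] = 42 };
    return table[4] == 42 && table[0] == 0;
}
```

Compiling the same file as C++20 should fail on the out-of-order initializer in `make_config`, which is the incompatibility being discussed.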


Any look into this will show that C's designated initializers are significantly more flexible/better in every way.


It’s amusing to me that whenever C has a feature that C++ doesn’t, C programmers hammer on the usefulness and power of said feature, but on the converse those same C programmers hammer on the divine simplicity of C.

C++ has constructors, it doesn’t really need designated initializers in the first place. They were added primarily for C compatibility.


> Of course the issue for C++, unlike the other examples here, is that it does not have a runtime layer that can be used to bridge the gap

neither, per standards, does c


Yes, but C is simple (and standard) enough that higher level runtimes have more or less universally managed to construct decent FFI mechanisms to access it.

C++ FFI would be kind of feasible if mangling were standardized (for instance) :-/


Standardizing name mangling would only be a baby step towards C++ ABI compatibility, sadly.


The only times I had to do that I.... converted the whole heap of C++ to C and used that instead. You rarely see libraries using all the most complex constructs of C++, and quite frankly, if they do, I'd rather stay clear of that pile of bloatware :-)

I wrote and shipped C++ as a job for many years -millions of lines I'm sure- and I've 'reverted' to plain C around 2007 or so, and I couldn't be happier.


> I wrote and shipped C++ as a job for many years -millions of lines I'm sure- and I've 'reverted' to plain C around 2007 or so, and I couldn't be happier.

I did too, all the way up to (IIRC) 2011. My happiness levels improved when I stopped working on C++ and, for the type of native-code problems that I used C++ for, used plain C instead.

My goal is to ship code, with a trade-off between delivery dates, runtime performance and robustness. As the problem gets more complex, I find that C++ muddies the waters even more, impacting all three axes above.


Hey, first time posting something I've made here, I think :D. I'm excited to hear what you think about it!


I'd suggest removing the part where you use malloc to allocate and just skip to the part where you use new and link to libstdc++. At the malloc part I was wondering what the point of the whole thing was when you're basically using none of the C++ logic and rewriting everything in C.


Also the Rational usage example in C++ is using stack allocation. It's weird to then pretend like heap allocation is equivalent in the C example. I think making stack allocation work in this case would probably be difficult, but it's worth pointing out in the article that the choice is intentional and not a misunderstanding of non-new allocation in c++.

You could just use new() in the first example instead and avoid the whole issue.


I think you can do better than this void* as the equivalent of `this` - if you consider what is happening with FILE* when you use stdio.h, you have basically a class interface, and I'd follow this pattern.

There is no reason to use void*, create a distinct type which can be opaque if you want, and then you can hide the implementation details in the C++ implementation to call through to the C++ classes. You get some degree of type safety this way.


Indeed. It's super easy to create an opaque type in C: forward-declare a struct. So in the header:

    struct Rational2;
    struct Rational2 *make_rational(int,int);

Then in the file:

    Rational2 *make_rational(int n,int d){
        return (Rational2 *)new Rational(n,d);
    }

    void del_rational(Rational2 **pp){
        delete (Rational *)*pp; // cast back to the real class; Rational2 is incomplete
        *pp=nullptr;
    }

And so on. You could probably arrange for it to be called Rational in both languages, starting out along these lines and then taking it from there:

    #ifdef __cplusplus
    class Rational { ... };
    #else
    struct Rational;
    typedef struct Rational Rational;
    #endif

And now you can call your C helpers from C++ as well, and the result is a genuine C++ Rational object that you can use either way. I don't think the ODR applies across languages, and I'm not 100% certain this would actually be an ODR violation anyway, at least not quite, but you'd need to ask somebody more qualified than me.

(Another suggestion I would have is to bracket the entire header in the ifdef'd extern "C" {...} block, which limits the amount of extra crap you have in the header and per function. I think you can direct clang-format not to indent these blocks.)


The converse is that if your class is just a standard layout object with a trivial destructor, you can expose an equivalent C definition without all the C++ sugar and avoid the forced allocation.

The allocation and the opaque handler can still be useful for ABI stability purposes, but that's true in C++ as well.


it's super easy to define an opaque type in c++ - you do it just like you do in c. you don't need to jump through these hoops.


Quite. And in fact, in this specific situation, in C++ you don't need to jump through any hoops at all, because you have the Rational class there already, ready for use. This whole business only exists to provide a C-friendly wrapper for this existing C++ class, along the lines of the one proposed in the article, with a couple of tweaks that I think would improve it.


Your allocation method - malloc + cast - may work for simple classes, but overall it's a serious no-no; I think you should either be using new/delete or, if sticking with malloc/free, use placement new and an explicit destructor call respectively.


And the stupidest thing: he uses this malloc in a .cc file, where it would be no issue to just use new. That's the point where I stopped reading.

After auto main() -> int { ... } I didn't suppose that anything really meaningful could come...


Thanks, you spared me the horrors.


The article discusses constructors and how to make them work under "Linking the C++ standard library".

I wonder if placement new would run into the same linker problem that the article mentions -- I'll have to try it at some point :)


This is kind of what Microsoft's COM gives you. You have to write your classes in a particular way but then you get a well-defined API that can be consumed in many languages, including C.


> We successfully created a class in C++ that we can now use in C!

It's substantially easier to bind FFI to it if it's usable from C.

If it's usable from C, it's usable from umpteen other languages.

Thou shalt not write a C++ component without a C API.


Yes, but the performance penalty becomes glaringly obvious as you are constantly dereferencing (composition, relationship and function) pointers and thus generate misses..


why not forward declare a struct type in `#ifndef __cplusplus` and use pointers to that type instead of `void*` pointers?


He uses malloc instead of new in a .cc file. I don't think he realized that everything he is doing is overcomplicated.


class is a mistake, all C needs is a proper union (tagged union)

Both Rust and Zig for example nailed it



