I've been writing a large app in C again after a few years away. Originally wrote the app in C++ then again in Python and now in straight C. I enjoy the faster compile times of C (over C++) and the better type checking of C (over Python).
However, when I'm carefully using C++, I don't have to fiddle around with memory management and still get fast performance and better type checking.
Yesterday I wrote a Python script to generate C code test cases. String manipulation in Python is very easy.
> Yesterday I wrote a Python script to generate C code test cases.
I do this quite a bit. Basically, anytime there is something very boilerplate-heavy (like HTTP APIs, configuration management, test cases, etc.), I write metadata in JSON that describes what I'm adding, and then at build time a Python script parses the metadata and generates all of the C or C++, which is then compiled. I even have the Python generating comments in the generated C/C++!
I used to use Jinja2 for generating the C/C++ files, but now I just use f-strings in the latest version of vanilla Python 3.
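As a toy example of the f-string approach (the metadata shape, file names, and the parse_u8() function are made up for illustration, not my real build):

import json

# Turn one piece of JSON test metadata into a C test case with an f-string,
# roughly what I previously did with a Jinja2 template.
meta = json.loads('{"name": "rejects_overlong_length", "input": 300, "expected": -1}')

test_case = f"""/* Generated by gen_tests.py -- do not edit by hand. */
#include <assert.h>
#include "parser.h"

static void test_{meta['name']}(void)
{{
    assert(parse_u8({meta['input']}) == {meta['expected']});
}}
"""

with open(f"test_{meta['name']}.c", "w") as out:
    out.write(test_case)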
This is what Lisp/Scheme-based languages use macros for: reducing boilerplate code. A lot of programmers hate DSLs, but using Python to generate test code is the same as using a DSL. The difference is that Python is a much bigger and more complicated DSL than a test macro will probably ever be.
There is nothing wrong with generating code in a pragmatic way. It is just strange to see people constantly using workarounds for things that a tiny DSL could probably solve better.
There's another advantage to using a high-level language like Python, though: since the "source" is now JSON metadata decoupled from the code, I can use that metadata to generate other interesting things. For example, I can use python-docx to generate a Word document (which my company requires for documentation tracking). Or, if the metadata describes HTTP APIs, I can generate an OpenAPI document which can be fed into Swagger UI[1].
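To make that concrete, the OpenAPI side is just another traversal of the same kind of metadata (tiny sketch; the endpoint fields here are made up):

import json

# The same JSON metadata that drives the C/C++ generation can be walked again
# to emit a minimal OpenAPI document. Endpoint fields are illustrative.
endpoints = [{"path": "/telemetry/backoff", "method": "get",
              "summary": "Read the telemetry backoff mode"}]

openapi = {"openapi": "3.0.3",
           "info": {"title": "Device API", "version": "1.0.0"},
           "paths": {}}
for ep in endpoints:
    openapi["paths"].setdefault(ep["path"], {})[ep["method"]] = {
        "summary": ep["summary"],
        "responses": {"200": {"description": "OK"}},
    }

with open("openapi.json", "w") as out:
    json.dump(openapi, out, indent=2)   # feed this into Swagger UI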
This is really interesting — nice work! Have you published any tooling as OSS or examples? If you haven’t seen it before, the Model Oriented Programming work by Pieter Hintjens is worth looking at.
Not yet. Most of the stuff I've done is internal to my company unfortunately. But just to give you an idea, if I were adding a new configuration parameter to our IoT device, I would add an object to a JSON array that looks like this:
{
"name": "backoff_mode",
"group": "telemetry",
"description": "We don't want to clog up networks with tons of failing attempts to send home telemetry. This parameter configures the device to back off attempts according to the specified mode",
"cpptype": "std::string",
"enum": ["none", "linear", "exponential"],
"default_value": "exponential"
}
At build time, a few things are generated, including a .h file with declarations for this parameter. The boilerplate for getting and setting this parameter is also generated. Thus, just by adding that simple piece of metadata, all the boilerplate and documentation is generated, and you as the developer can focus on the actual logic that needs to be implemented for this parameter.
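The build step is essentially just a loop over the JSON; a simplified sketch (the file names and the exact generated signatures here are illustrative, not the real output):

import json

# Simplified sketch of the build-time generator: read the parameter metadata
# and emit a header with documented getter/setter declarations.
with open("config_params.json") as f:
    params = json.load(f)

lines = ["#pragma once", "#include <string>", ""]
for p in params:
    lines.append(f"// {p['description']}")
    if "enum" in p:
        lines.append(f"// Allowed values: {', '.join(p['enum'])} (default: {p['default_value']})")
    lines.append(f"{p['cpptype']} get_{p['group']}_{p['name']}();")
    lines.append(f"void set_{p['group']}_{p['name']}(const {p['cpptype']} &value);")
    lines.append("")

with open("generated_config.h", "w") as out:
    out.write("\n".join(lines))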
Note that mypy, Python's de facto standard type checker, is pretty good (within the limits of type erasure); I started the project I've been leading at work this past year in statically typed Python, and it's been a great experience.
I tried to like mypy, really did. But it always ended up in some kind of circular import tailspin for me. C++ gets around the problem by allowing separation of interface and implementation, Haskell by allowing circular imports. Maybe I missed something, maybe it's fixed by now, but that was my experience a couple of years ago.
I've played with mypy a little, and it's certainly not bad, but it does suffer by comparison to TypeScript, which just has a lot of really great structural typing features that mypy still lacks.
mypy recently got structural typing. They call it Protocols, which isn't the first thing one would put into Google when looking for this feature. It's not as powerful or terse as TypeScript, unfortunately.
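For anyone searching for it, here's roughly what a Protocol looks like (the class names are just an illustration):

from typing import Protocol  # Python 3.8+, or typing_extensions on older versions

class SupportsClose(Protocol):
    def close(self) -> None: ...

class TelemetrySocket:
    def close(self) -> None:
        print("socket closed")

def shutdown(resource: SupportsClose) -> None:
    resource.close()

# TelemetrySocket never mentions SupportsClose; mypy accepts it structurally.
shutdown(TelemetrySocket())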
I haven't tried mypy yet but want to get to it. It sounds very interesting! After using Python so much the last several years, I've gotten more curmudgeonly about strong type checking in my programming languages.
Don't wait for a new project, just start using it (it's most beneficial if you specify types for every function's parameters and return value). If you use an IDE that understands it, like PyCharm, suddenly the autocomplete will start working right, so will refactoring, and it will start highlighting potential errors.
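Even a trivial annotated function pays off (a made-up example):

def backoff_delay(attempt: int, base_seconds: float = 1.0) -> float:
    """Annotating parameters and the return type lets mypy and the IDE catch misuse."""
    return base_seconds * (2 ** attempt)

delay = backoff_delay(3)       # the IDE now knows delay is a float
# backoff_delay("three")       # mypy would flag: incompatible type "str"; expected "int"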
IME, 'C with classes', or C++ while skipping the more esoteric features (OOP overuse, template metaprogramming), yields a fairly easy-to-understand language for most C folks. I'm a C guy and am most comfortable with it, but I like having the advanced data structures out of the box. Granted, most C codebases of any reasonable complexity have their own support for the relevant data structures, but they're a lot cleaner to use via the STL.
Python is a great tool for a certain class of jobs. I saw a company doing systems programming in Python and originally thought it was a great idea, and then I learned of the hidden (or not) dangers of Python.
Thanks for the link! I just installed it on my i5 Thinkpad. I can't test filters and more complex stuff right now, but it loads super fast, like 150 ms uncached, and probably less than 50 ms when cached.
I'm curious if there is other software around rewritten in pure C. It would likely be a huge task not worth the effort, but I can't help but imagine the speed and size gains from rewriting, say, Firefox or LibreOffice.
I think what squarefoot is looking for is lightweight (tiny?) C apps that have limited abstraction layers. For example, Gtk+, which is pure C, is no longer considered lightweight. It was with Gtk+ 1.2 because GTK was just a small wrapper around X11 in the 1.x days. That is what AzPainter is. AzPainter basically depends on xlib, xft/freetype, and some image libraries (libpng, libjpeg).
Modern C apps with a GUI typically build on GTK, but that toolkit has become massive. Modern GTK3 apps are built on top of dbus, pango, atk, and cairo. This is not necessarily a bad thing, though, and libraries like ATK provide accessibility, which is important. When GTK+ was created, the competing toolkit under Unix was Motif, which had the nickname "bloatif". Now we've reached a point where GTK3 is much, much larger and more "bloated" than a 2019 Motif application.
If we include C++, the Qt and wxWidgets toolkits are also massive. FLTK is still a lightweight option, though.
"I think what squarefoot is looking for is lightweight (tiny?) C apps that have limited abstraction layers. For example, Gtk+, which is pure C, is no longer considered lightweight."
Yes, I also meant the UI, which in this one is amazingly fast. I tried a few filters and they seem fast too, but not being a gfx expert I have no way to judge. But the UI is incredible; same speed on my home PC, which has a mechanical disk. Making an external library of its GUI primitives and functions could be an interesting project for use on small embedded boards.
Since AzPainter v2.1.3, the `mlib` sources shipped with AzPainter are licensed under GPLv3 terms, but there is the AzPainter theme editor (an older minimal `mlib` demo app) whose sources are licensed under BSD terms:
In the mid-'90s, before mod_perl, I used to generate PHP in a minutely cron job, using a Perl script fed by a constantly changing database. That came close to the best of both worlds at the time: the flexibility of Perl, fed from a database, and no CGI overhead when serving.
Code generation usually doesn't get you any support from standard tooling. My autocomplete can see through macros, but it would have no idea about the C if I were generating it from Perl.
One large factor is that when somebody really dislikes the output of a code generator, they most commonly cope by editing the generated code by hand. With other metaprogramming techniques people can't do that, so they fix the issues at the source.
The end result is that code generators are usually a set of mostly-functioning tools that create bad code that is almost, but not quite, impossible to understand, but that you will be required to change at some point.
I use them. The executable stack bit doesn't bother me since I'm on an ARM Cortex. Would be nice if they fixed that because they are kinda handy once you get used to them.
It's what it doesn't give me: it doesn't have a memory management unit, so there's no stack execution protection, and thus no protection against classic stack-smashing attacks.
I saw a talk about weird machines. The upshot of it is that if you can clobber the return address on the stack, you can usually create a weird machine that you can then program[1]. That leads me to believe that if you are putting untrusted data and return addresses on the same stack, you're in a state of sin security-wise.
[1] I also saw an exploit on an embedded system where they leveraged the fact that, even though the memory was locked, you could set the program counter and read/write registers via JTAG. Using that, they were able in short order to recover the device's security keys and re-flash it with their own code.
This [1] explains how to create them using nested functions. Near the end is a quick blurb about the implementation, which is probably why C++ doesn't do it that way.
I don’t like C++ anymore. I used to love it but now it’s just a bloated mess of a language. I get it, you don’t have to use every part of the language but I really don’t like any of the newer bits and that’s what everyone wants to use.
It's pretty simple. I'm decoding a structure into bit fields (an 802.11 Information Element). I have a hex dump that I decode into two structures with almost 100 fields (eyeroll at the 802.11 committee). Each field can only be 0, 1, or sometimes up to 15.
I have a list of the structure members. I generate an assert for each member being a certain value. The Python script builds and runs the C code. On failure, I parse the assertion failure, get the actual value, change the C assert string, and rebuild and rerun. This continues until the program succeeds.
I've visually verified the decode is correct once. I want to keep the decoder tested if I change the code again, so I'm generating complete test coverage.
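Roughly, the loop looks like this (a heavily simplified sketch; the member names, file names, decode.c/decode.h, HEX_DUMP, and the CHECK output format are all illustrative):

import re
import subprocess

# Sketch of the generate/build/run/fix loop described above.
members = ["ht_cap.ldpc_coding", "ht_cap.short_gi_20mhz", "ht_cap.rx_stbc"]
expected = {m: 0 for m in members}   # initial guess: everything decodes to 0

def emit_test() -> str:
    checks = "\n".join(f'    CHECK("{m}", {m}, {v});' for m, v in expected.items())
    return f"""#include <stdio.h>
#include "decode.h"

#define CHECK(name, actual, want) \\
    do {{ if ((actual) != (want)) {{ \\
        printf("FAIL %s actual=%d\\n", name, (int)(actual)); return 1; }} }} while (0)

int main(void)
{{
    struct ht_cap ht_cap = decode_ht_cap(HEX_DUMP);
{checks}
    return 0;
}}
"""

while True:
    with open("test_decode.c", "w") as out:
        out.write(emit_test())
    subprocess.run(["cc", "-o", "test_decode", "test_decode.c", "decode.c"], check=True)
    run = subprocess.run(["./test_decode"], capture_output=True, text=True)
    if run.returncode == 0:
        break                        # every field matches; keep the file as the regression test
    field, actual = re.search(r'FAIL (\S+) actual=(\d+)', run.stdout).groups()
    expected[field] = int(actual)    # accept the observed value and go again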
It sounds like your use case is property-based testing. Have you tried any of the QuickCheck ports to C++? My favorite is RapidCheck (https://github.com/emil-e/rapidcheck). You define generators for your struct fields and write a pure function to check whether the parsed output conforms to your expectations.
This way you can have the speed and memory tightness of C++ where it matters, but do the other stuff, like initial data parsing and munging or overall control and sequencing, in Python, thus avoiding the general clumsiness and unproductiveness of C++.
> However, when I'm carefully using C++, I don't have to fiddle around with memory management and still get fast performance and better type checking.
> Yesterday I wrote a Python script to generate C code test cases. String manipulation in Python is very easy.
The right tool for the right job, I suppose.