I've been writing a large app in C again after a few years away. Originally wrote the app in C++ then again in Python and now in straight C. I enjoy the faster compile times of C (over C++) and the better type checking of C (over Python).
However, when I'm carefully using C++, I don't have to fiddle around with memory management and still get fast performance and better type checking.
Yesterday I wrote a Python script to generate C code test cases. String manipulation in Python is very easy.
> Yesterday I wrote a Python script to generate C code test cases.
I do this quite a bit. Basically, anytime there is something very boilerplate-heavy (like HTTP APIs, configuration management, test cases, etc.), I write metadata in JSON that describes what I'm adding, and then at build time a Python script parses the metadata and generates all of the C or C++, which is then compiled. I even have the Python generating comments in the generated C/C++!
I used to use Jinja2 for generating the C/C++ files, but now I just use f-strings in the latest version of vanilla Python 3.
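As a toy example of the f-string approach (the metadata shape, file names, and the parse_u8() function are made up for illustration, not my real build):

import json

# Turn one piece of JSON test metadata into a C test case with an f-string,
# roughly what I previously did with a Jinja2 template.
meta = json.loads('{"name": "rejects_overlong_length", "input": 300, "expected": -1}')

test_case = f"""/* Generated by gen_tests.py -- do not edit by hand. */
#include <assert.h>
#include "parser.h"

static void test_{meta['name']}(void)
{{
    assert(parse_u8({meta['input']}) == {meta['expected']});
}}
"""

with open(f"test_{meta['name']}.c", "w") as out:
    out.write(test_case)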
This is what Lisp/Scheme-based languages use macros for: reducing boilerplate code. A lot of programmers hate DSLs, but using Python to generate test code is the same as using a DSL. The difference is that Python is a much bigger and more complicated DSL than a test macro will probably ever be.
There is nothing wrong with generating code in a pragmatic way. It is just strange to see people constantly using workarounds for things that a tiny DSL could probably solve better.
There's another advantage to using a high-level language like Python, though: since the "source" is now JSON metadata decoupled from the code, I can use that metadata to generate other interesting things. For example, I can use python-docx to generate a Word document (which my company requires for documentation tracking). Or, if the metadata describes HTTP APIs, I can generate an OpenAPI document which can be fed into Swagger UI[1].
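To make that concrete, the OpenAPI side is just another traversal of the same kind of metadata (tiny sketch; the endpoint fields here are made up):

import json

# The same JSON metadata that drives the C/C++ generation can be walked again
# to emit a minimal OpenAPI document. Endpoint fields are illustrative.
endpoints = [{"path": "/telemetry/backoff", "method": "get",
              "summary": "Read the telemetry backoff mode"}]

openapi = {"openapi": "3.0.3",
           "info": {"title": "Device API", "version": "1.0.0"},
           "paths": {}}
for ep in endpoints:
    openapi["paths"].setdefault(ep["path"], {})[ep["method"]] = {
        "summary": ep["summary"],
        "responses": {"200": {"description": "OK"}},
    }

with open("openapi.json", "w") as out:
    json.dump(openapi, out, indent=2)   # feed this into Swagger UI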
This is really interesting — nice work! Have you published any tooling as OSS or examples? If you haven’t seen it before, the Model Oriented Programming work by Pieter Hintjens is worth looking at.
Not yet. Most of the stuff I've done is internal to my company unfortunately. But just to give you an idea, if I were adding a new configuration parameter to our IoT device, I would add an object to a JSON array that looks like this:
{
"name": "backoff_mode",
"group": "telemetry",
"description": "We don't want to clog up networks with tons of failing attempts to send home telemetry. This parameter configures the device to back off attempts according to the specified mode",
"cpptype": "std::string",
"enum": ["none", "linear", "exponential"],
"default_value": "exponential"
}
At build time, a few things are generated, including a .h file with declarations for this parameter. The boilerplate for getting and setting this parameter is also generated. Thus, just by adding that simple piece of metadata, all the boilerplate and documentation is generated, and you as the developer can focus on the actual logic that needs to be implemented for this parameter.
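The build step is essentially just a loop over the JSON; a simplified sketch (the file names and the exact generated signatures here are illustrative, not the real output):

import json

# Simplified sketch of the build-time generator: read the parameter metadata
# and emit a header with documented getter/setter declarations.
with open("config_params.json") as f:
    params = json.load(f)

lines = ["#pragma once", "#include <string>", ""]
for p in params:
    lines.append(f"// {p['description']}")
    if "enum" in p:
        lines.append(f"// Allowed values: {', '.join(p['enum'])} (default: {p['default_value']})")
    lines.append(f"{p['cpptype']} get_{p['group']}_{p['name']}();")
    lines.append(f"void set_{p['group']}_{p['name']}(const {p['cpptype']} &value);")
    lines.append("")

with open("generated_config.h", "w") as out:
    out.write("\n".join(lines))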
Note that mypy, Python's de facto standard type checker, is pretty good (within the limits of type erasure); I started the project I've been leading at work this past year in statically typed Python, and it's been a great experience.
I tried to like mypy, really did. But it always ended up in some kind of circular import tailspin for me. C++ gets around the problem by allowing separation of interface and implementation, Haskell by allowing circular imports. Maybe I missed something, maybe it's fixed by now, but that was my experience a couple of years ago.
I've played with mypy a little, and it's certainly not bad, but it does suffer by comparison to TypeScript, which just has a lot of really great structural typing features that mypy still lacks.
mypy recently got structural typing. They call it Protocols, which isn't the first thing one would put into Google when looking for this feature. It's not as powerful or terse as TypeScript, unfortunately.
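For anyone searching for it, here's roughly what a Protocol looks like (the class names are just an illustration):

from typing import Protocol  # Python 3.8+, or typing_extensions on older versions

class SupportsClose(Protocol):
    def close(self) -> None: ...

class TelemetrySocket:
    def close(self) -> None:
        print("socket closed")

def shutdown(resource: SupportsClose) -> None:
    resource.close()

# TelemetrySocket never mentions SupportsClose; mypy accepts it structurally.
shutdown(TelemetrySocket())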
I haven't tried mypy yet but want to get to it. It sounds very interesting! After using Python so much the last several years, I've gotten more curmudgeonly about strong type checking in my programming languages.
Don't wait for a new project, just start using it (it's most beneficial if you specify types for every function's parameters and return value). If you use an IDE that understands it, like PyCharm, suddenly the autocomplete will start working right, so will refactoring, and it will start highlighting potential errors.
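Even a trivial annotated function pays off (a made-up example):

def backoff_delay(attempt: int, base_seconds: float = 1.0) -> float:
    """Annotating parameters and the return type lets mypy and the IDE catch misuse."""
    return base_seconds * (2 ** attempt)

delay = backoff_delay(3)       # the IDE now knows delay is a float
# backoff_delay("three")       # mypy would flag: incompatible type "str"; expected "int"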
IME, 'C with classes', or C++ while skipping the more esoteric features (OOP overuse, template metaprogramming), yields a fairly easy-to-understand language for most C folks. I'm a C guy and am most comfortable with it, but I like having the advanced data structures out of the box. Granted, most C codebases of any reasonable complexity have their own support for the relevant data structures, but they're a lot cleaner to use via the STL.
Python is a great tool for a certain class of jobs. I saw a company doing systems programming in Python and originally thought it was a great idea, and then I learned of the hidden (or not) dangers of Python.
Thanks for the link! I just installed it on my i5 Thinkpad. I can't test filters and more complex stuff right now, but it loads super fast, like 150 ms uncached, and probably less than 50 ms when cached.
I'm curious if there is other software around rewritten in pure C. It would likely be a huge task not worth the effort, but I can't help but imagine the speed and size gains from rewriting, say, Firefox or LibreOffice.
I think what squarefoot is looking for is lightweight (tiny?) C apps that have limited abstraction layers. For example, Gtk+, which is pure C, is no longer considered lightweight. It was with Gtk+ 1.2 because GTK was just a small wrapper around X11 in the 1.x days. That is what AzPainter is. AzPainter basically depends on xlib, xft/freetype, and some image libraries (libpng, libjpeg).
Modern C apps with a GUI typically build on GTK, but that toolkit has become massive. Modern GTK3 apps are built on top of dbus, pango, atk, and cairo. This is not necessarily a bad thing, though, and libraries like ATK provide accessibility, which is important. When GTK+ was created, the competing toolkit under Unix was Motif, which had the nickname "bloatif". Now we've reached a point where GTK3 is much, much larger and more "bloated" than a 2019 Motif application.
If we include C++, the Qt and wxWidgets toolkits are also massive. FLTK is still a lightweight option, though.
"I think what squarefoot is looking for is lightweight (tiny?) C apps that have limited abstraction layers. For example, Gtk+, which is pure C, is no longer considered lightweight."
Yes, I also meant the UI, which in this one is amazingly fast. I tried a few filters and they seem fast too, but not being a gfx expert I have no way to judge. But the UI is incredible; same speed on my home PC, which has a mechanical disk. Making an external library of its GUI primitives and functions could be an interesting project for use on small embedded boards.
Since AzPainter v2.1.3, the `mlib` sources shipped with AzPainter are licensed under GPLv3 terms, but there is the AzPainter theme editor (an older minimal `mlib` demo app) whose sources are licensed under BSD terms:
In the mid-'90s, before mod_perl, I used to generate PHP in a minutely cron job, using a Perl script fed by a constantly changing database. That came close to the best of both worlds at the time: the flexibility of Perl, fed from a database, and no CGI overhead when serving.
Code generation usually doesn't get you any support from standard tooling. My autocomplete can see through macros, but it would have no idea about the C if I were generating it from Perl.
One large factor is that when somebody really dislikes the output of a code generator, they most commonly cope by editing the generated code by hand. With other metaprogramming techniques people can't do that, so they fix the issues at the source.
The end result is that code generators are usually a set of mostly-functioning tools that create bad code that is almost, but not quite, impossible to understand, but that you will be required to change at some point.
I use them. The executable stack bit doesn't bother me since I'm on an ARM Cortex. Would be nice if they fixed that because they are kinda handy once you get used to them.
It's what it doesn't give me: it doesn't have a memory management unit, so there's no stack execution protection, and thus no protection against classic stack-smashing attacks.
I saw a talk about weird machines. The upshot of it is that if you can clobber the return address on the stack, you can usually create a weird machine that you can then program[1]. That leads me to believe that if you are putting untrusted data and return addresses on the same stack, you're in a state of sin security-wise.
[1] I also saw an exploit on an embedded system where they leveraged the fact that, even though the memory was locked, you could set the program counter and read/write registers via JTAG. Using that, they were able in short order to recover the device's security keys and re-flash it with their own code.
This [1] explains how to create them using nested functions. Near the end is a quick blurb about the implementation, which is probably why C++ doesn't do it that way.
I don’t like C++ anymore. I used to love it but now it’s just a bloated mess of a language. I get it, you don’t have to use every part of the language but I really don’t like any of the newer bits and that’s what everyone wants to use.
It's pretty simple. I'm decoding a structure into bit fields (an 802.11 Information Element). I have a hex dump that I decode into two structures with almost 100 fields (eyeroll at the 802.11 committee). Each field can only be 0, 1, or sometimes up to 15.
I have a list of the structure members. I generate an assert for each member being a certain value. The Python script builds and runs the C code. On failure, I parse the assertion failure, get the actual value, change the C assert string, and rebuild and rerun. This continues until the program succeeds.
I've visually verified the decode is correct once. I want to keep the decoder tested if I change the code again, so I'm generating complete test coverage.
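Roughly, the loop looks like this (a heavily simplified sketch; the member names, file names, decode.c/decode.h, HEX_DUMP, and the CHECK output format are all illustrative):

import re
import subprocess

# Sketch of the generate/build/run/fix loop described above.
members = ["ht_cap.ldpc_coding", "ht_cap.short_gi_20mhz", "ht_cap.rx_stbc"]
expected = {m: 0 for m in members}   # initial guess: everything decodes to 0

def emit_test() -> str:
    checks = "\n".join(f'    CHECK("{m}", {m}, {v});' for m, v in expected.items())
    return f"""#include <stdio.h>
#include "decode.h"

#define CHECK(name, actual, want) \\
    do {{ if ((actual) != (want)) {{ \\
        printf("FAIL %s actual=%d\\n", name, (int)(actual)); return 1; }} }} while (0)

int main(void)
{{
    struct ht_cap ht_cap = decode_ht_cap(HEX_DUMP);
{checks}
    return 0;
}}
"""

while True:
    with open("test_decode.c", "w") as out:
        out.write(emit_test())
    subprocess.run(["cc", "-o", "test_decode", "test_decode.c", "decode.c"], check=True)
    run = subprocess.run(["./test_decode"], capture_output=True, text=True)
    if run.returncode == 0:
        break                        # every field matches; keep the file as the regression test
    field, actual = re.search(r'FAIL (\S+) actual=(\d+)', run.stdout).groups()
    expected[field] = int(actual)    # accept the observed value and go again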
It sounds like your use case is property-based testing. Have you tried any of the QuickCheck ports to C++? My favorite is RapidCheck (https://github.com/emil-e/rapidcheck). You define generators for your struct fields and write a pure function to check whether the parsed output conforms to your expectations.
This way you can have the speed and memory tightness of C++ where it matters, but do the other stuff, like initial data parsing and munging or overall control and sequencing, in Python, thus avoiding the general clumsiness and unproductiveness of C++.
> However, when I'm carefully using C++, I don't have to fiddle around with memory management and still get fast performance and better type checking.
> Yesterday I wrote a Python script to generate C code test cases. String manipulation in Python is very easy.
The right tool for the right job, I suppose.