Metaprogramming is very alluring on the surface, we've all been frustrated by th...

ulrikrasmussen · on May 6, 2020

Metaprogramming via reflection is used heavily in the JVM ecosystem, and I can say with confidence that a majority of the bugs I encounter in third party code are somehow related to reflection. I think the overuse of reflection in Java is a symptom of the language not having adequate support for expressing the abstractions that developers need in an affordable way, and hence they turn to reflection to work around these limitations. This leads developers down a path where they can't seem to stop applying reflection until they reach a point where the type system gives them almost no meaningful guarantees and the compositionality of their components is ruined.

At least compile time reflection and code generation will catch a large chunk of bugs that would otherwise be deferred to runtime. I will take a puzzling compile time error message over having to debug runtime reflection errors any day.

tannhaeuser · on May 6, 2020

> I think the overuse of reflection in Java is a symptom of the language not having adequate support for expressing the abstractions that developers need in an affordable way, and hence they turn to reflection to work around these

Another explanation is that playing around with annotations and AST decorations/rewriters is an excellent excuse for procrastinating through your day without having to deal with mundane business code :). I also think there's a psychological effect at work here where you're deluding yourself into being a "tool developer" and part of an academic discourse when you throw around annotation libs, to deflect from the harsh reality that you're working in cost-center IT. Or maybe a kind of inner migration from an enterprise code base which you can't identify with, and wish to put a fence against, as in "they" (your lib users) vs "us" (elite metaprogrammers). As someone else said, today's Spring-heavy Java code bases express behaviour through "anything and everything except actual Java code". C++ developers should take a look at Spring MVC in particular to see if that really is what they want, where the relative sequence of method parameters has significance, and when presented slightly differently will result in service request routing 404ing, which you find out only by examining 600-item stack traces with multiple reflection pits.

ulrikrasmussen · on May 6, 2020

I think that is a very accurate analysis. There are generally two types of reflection code: the type that aims to achieve modularity and decoupling, and the kind that aims to reduce boilerplate code. I avoid both like the plague, but I think that the latter type in particular is a result of the psychological bias that developers have against writing and maintaining trivial boilerplate code vs. developing fancy tools to reduce it. In the end, you often spend much more time debugging strange issues with the reflection based alternatives than you do actually writing and maintaining the boilerplate.

This is also why we have banished all forms of reflection based serialization in favour of hand-written JSON mappings. Yes, they are a bit tedious to write, but it isn't actually that bad. As a plus, you get type errors up front, and you avoid strange errors due to e.g. third party `Map` libraries that the reflection based "magic" cannot figure out how to handle correctly. If the amount of work gets out of hand, you can always turn to code generation later if it is deemed worth it.
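For a sense of how little code a hand-written mapping takes, here is a minimal sketch. It is in C++ rather than Java to match the rest of the thread, and the `User` type and both function names are made up for illustration. It only escapes quotes and backslashes, so treat it as a sketch and not a complete JSON writer.

```cpp
#include <cassert>
#include <string>

// Hypothetical domain type, used only for illustration.
struct User {
    std::string name;
    int age;
};

// Escape the two characters that would break a JSON string literal.
// Control characters are not handled, to keep the sketch short.
std::string escape_json(const std::string& s) {
    std::string out;
    for (char c : s) {
        if (c == '"' || c == '\\') out += '\\';
        out += c;
    }
    return out;
}

// The hand-written mapping: every field is spelled out, so removing or
// renaming a field in User is a compile error right here.
std::string to_json(const User& u) {
    return "{\"name\":\"" + escape_json(u.name) +
           "\",\"age\":" + std::to_string(u.age) + "}";
}

// Usage example.
void demo_json() {
    assert(to_json(User{"Ada", 36}) == "{\"name\":\"Ada\",\"age\":36}");
}
```

The tedium is real, but the failure mode is a build error at the mapping function, not a surprise at runtime.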

That being said, I think generic programming and avoidance of boilerplate does have its merits as it can help reduce the cost of abstractions. But it absolutely must be done in a principled way rather than be a result of quick and dirty hacks such as C macros, Java reflection and C++ templates, which all accidentally give you an advanced metaprogramming environment with little safety. An example of an approach that I like is the "Generics SOP" approach in Haskell, although I do recognize that the type-level programming that it involves is not for everyone: https://www.andres-loeh.de/TrueSumsOfProducts/

gpderetta · on May 6, 2020

> In the end, you often spend much more time debugging strange issues with the reflection based alternatives than you do actually writing and maintaining the boilerplate.

It pains me to say that, but I think you are right. A lot of boilerplate elimination ends up being premature generalization.

jcelerier · on May 6, 2020

> C++ developers should take a look at Spring MVC in particular to see if that really is what they want,

I don't understand why it would be bad to make it impossible to have Spring-like frameworks in C++. A metric ton of useful things have been written in it, and are only written much more painstakingly in C++.

tannhaeuser · on May 6, 2020

The obvious answer would be: why not use Java/Spring then? Does C++ have to be everything to everybody (though that ship has probably sailed some 30 years ago)?

jcelerier · on May 6, 2020

> The obvious answer would be: why not use Java/Spring then? Does C++ have to be everything to everybody (though that ship has probably sailed some 30 years ago)?

I really prefer writing C++ code where:

- I have the choice of programming style for every subproblem of my software, rather than Java code where most of the time the only choice is new-riddled, OOPish BS. Writing a Java visitor or observer pattern once again makes me shiver with dread when I'm used to std::variant and Qt's signal / slots. I'll admit that Scala mostly solves that though; if I really had to develop on the JVM that's likely the only language that I'd happily use. No type-level programming -> not relevant for me, given how many metric tons of bugs this has saved me so far.

And integrating JVM code with C++ (or any kind of native) code is an exercise in pain - I've had the displeasure of wrapping one of the libraries I've developed through JNA to make it accessible to Processing, wouldn't wish that on my enemies.

- Things can be made to happen deterministically and automatically with RAII. I still have nightmares of trying to get finalizers to work in C#, for instance, to release resources other than memory at deterministic times and not "some time away in the future".
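The two points above can be sketched in a few lines of C++17. All type and function names here (`KeyPress`, `MouseMove`, `ScopedLog`, and so on) are invented for the example. The first half visits a `std::variant` without any visitor class hierarchy; the second half uses a destructor to release a stand-in "resource" at a known point.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// A closed set of alternatives; no visitor interface or base class needed.
struct KeyPress  { char key; };
struct MouseMove { int x; int y; };
using Event = std::variant<KeyPress, MouseMove>;

// Common C++17 idiom: merge several lambdas into one overload set.
template <class... Fs> struct overloaded : Fs... { using Fs::operator()...; };
template <class... Fs> overloaded(Fs...) -> overloaded<Fs...>;

std::string describe(const Event& e) {
    return std::visit(overloaded{
        [](const KeyPress& k)  { return std::string("key ") + k.key; },
        [](const MouseMove& m) { return "move " + std::to_string(m.x) + "," + std::to_string(m.y); }
    }, e);
}

// RAII guard: the destructor runs at scope exit, at a known point,
// not whenever a garbage collector gets around to a finalizer.
class ScopedLog {
public:
    ScopedLog(std::vector<std::string>& log, std::string name)
        : log_(log), name_(std::move(name)) { log_.push_back("acquire " + name_); }
    ~ScopedLog() { log_.push_back("release " + name_); }
    ScopedLog(const ScopedLog&) = delete;
    ScopedLog& operator=(const ScopedLog&) = delete;
private:
    std::vector<std::string>& log_;
    std::string name_;
};

std::vector<std::string> run_scoped() {
    std::vector<std::string> log;
    {
        ScopedLog guard(log, "file");
        log.push_back("work");
    }  // guard is released here, before the next line runs
    log.push_back("after scope");
    return log;
}

// Usage example.
void demo_variant_raii() {
    assert(describe(KeyPress{'a'}) == "key a");
    assert(describe(MouseMove{3, 4}) == "move 3,4");
    assert((run_scoped() == std::vector<std::string>{
        "acquire file", "work", "release file", "after scope"}));
}
```

Forgetting to handle an alternative in `describe` is a compile error, and the release order in `run_scoped` is fixed by scoping alone.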

tannhaeuser · on May 6, 2020

Ok I can get that, though it's not that much of a problem with "finally" code blocks and modern idiomatic Java/try-with-resources. But (and I'm not pretending to be an expert here) I think attempting to write generic multithreaded service-oriented backends in a non-GCd language is going to give you a hard time with memory fragmentation (even more so with async/evented code), plus the performance, for all I know, isn't really all that great.

pjmlp · on May 6, 2020

I think you meant possible.

jcelerier · on May 6, 2020

eh, indeed, can't edit anymore. thanks !

Rexxar · on May 6, 2020

The main difference with Java is that all of this is done at compilation time. It changes a lot of things imho.

sgeisler · on May 6, 2020

Runtime metaprogramming is mighty, but also dangerous. I view compile time metaprogramming as a much saner thing. The type system can still help you avoid potential problems and if in doubt you can just look at the generated code. While it doesn't solve every problem solvable by runtime mp it's good enough in most cases (e.g. building serializers for classes as done with serde in Rust).

mckinney · on May 6, 2020

Indeed, the JVM sorely lacks in the reflection/metaprogramming department. The Manifold framework[1] picks up where Java leaves off. For instance, @Jailbreak is a badass, type-safe alternative to reflection.

[1]: https://github.com/manifold-systems/manifold

mtzet · on May 6, 2020

While I agree that metaprogramming is ripe for abuse, and I'd definitely prefer if my fellow programmers used it less than they do, I'd argue that the alternative to having metaprogramming is much worse and leads to brittle black-box voodoo code.

Metaprogramming is pretty wide, ranging from primitive textual substitutions like C macros to type-aware hygienic macros à la Rust, and from limited scope like C++ templates to full-on program writing like in Lisp.

In C and C++ (current versions) the entire module system is built up around the metaprogramming hack of doing #ifdef header guards. Even this is a bit error prone, but the alternative is only expressible as a compiler intrinsic (#pragma once).

In languages like C, Go and early Java, the lack of generics (a type of metaprogramming) makes it impossible to write type-safe generic algorithms, forcing casts to void*, interface{} and Object resp.
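A small sketch of that contrast, with invented function names: the same "pick the larger value" helper written once in the untyped `void*` style and once as a template.

```cpp
#include <cassert>
#include <string>

// Untyped version: the caller must remember what the pointers really point
// to. Passing the wrong comparator compiles fine and misbehaves at run time.
const void* max_untyped(const void* a, const void* b,
                        bool (*less)(const void*, const void*)) {
    return less(a, b) ? b : a;
}

bool less_int(const void* a, const void* b) {
    return *static_cast<const int*>(a) < *static_cast<const int*>(b);
}

// Generic version: the compiler instantiates it per type and rejects
// mismatched arguments, with no casts at the call site.
template <typename T>
const T& max_typed(const T& a, const T& b) {
    return a < b ? b : a;
}

// Usage example.
void demo_generic() {
    int x = 3, y = 7;
    // A cast is needed to get the result back out of the untyped version.
    assert(*static_cast<const int*>(max_untyped(&x, &y, less_int)) == 7);
    assert(max_typed(x, y) == 7);
    assert(max_typed(std::string("a"), std::string("b")) == "b");
    // max_typed(x, std::string("b"));  // would not compile: T cannot be deduced
}
```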

Implementing type checking for printf-like constructs requires compiler-intrinsics or C++17 constexpr meta-programming.
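As a rough illustration of the constexpr route, here is a C++17 sketch that only checks the *number* of arguments against the format string, not their types; a real checker would also parse each specifier. The names `count_specifiers` and `arity_matches` are made up. The compiler-intrinsic route is what GCC and Clang offer through their `format` function attribute.

```cpp
#include <cstddef>

// Count conversion specifiers in a printf-style format string at compile
// time. "%%" is an escaped percent sign and does not consume an argument.
constexpr std::size_t count_specifiers(const char* fmt) {
    std::size_t n = 0;
    for (std::size_t i = 0; fmt[i] != '\0'; ++i) {
        if (fmt[i] != '%') continue;
        if (fmt[i + 1] == '%') { ++i; continue; }  // skip escaped "%%"
        ++n;
    }
    return n;
}

// True when the format string expects exactly as many arguments as given.
template <typename... Args>
constexpr bool arity_matches(const char* fmt) {
    return count_specifiers(fmt) == sizeof...(Args);
}

// Usage example: mismatches are rejected while compiling, not while running.
static_assert(arity_matches<int, const char*>("%d items in %s"));
static_assert(!arity_matches<int>("%d and %d"));
```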

In C and C++ you must manually implement serialization and deserialization for structs, and everything else that is naturally expressed as iterating over the elements of a struct. Alternatively you could use something like protobuf, which (surprise!) compiles your protobuf file into a C++ program you can include. Using something like Rust's serde is /much/ simpler, and is only possible due to metaprogramming.

carlmr · on May 6, 2020

>Using something like Rust's serde is /much/ simpler, and is only possible due to metaprogramming.

Gotta agree here, you're shifting a lot of complexity from your code to the metaprogramming from the serde crate.

Sure it's difficult to use proc_macro style metaprogramming, but the user gets much simpler code.

psychoslave · on May 6, 2020

You can write non maintainable code at any level of abstraction.

How hard it will be to debug is more dependent on the available tools to link the error to some source code.

For example, there's a huge load of source-to-source compilers used in the web stack now. This is not such a big deal it seems, probably because debuggers do an adequate job of linking the error to the original source. Actually I haven't directly touched Babylscript and so on, but I haven't seen much complaint about traceability of errors, so here I'm just guessing: a more informed point of view would be valuable here.

Self-modifying code might have had its purpose in highly resource-constrained environments. But otherwise, in my opinion, generating a whole distinct source or tailoring a runnable AST has always been more understandable while offering the same level of flexibility.

logicchains · on May 6, 2020

>But, I think this trend might lead to extremely hard to read code, and there is a good chance that this hard-to-read code will be treated as some black box/voodoo.

This already happens with metaprogramming in C++. See e.g. the source of the Boost Preprocessor library. Circle just makes this code a bit easier to read. Even code using the library: compare the compile-time Duff's device generation in https://www.boost.org/doc/libs/1_72_0/libs/preprocessor/doc/... to that in https://github.com/seanbaxter/circle/blob/master/examples/RE... .

MauranKilom · on May 6, 2020

I disagree with using Boost Preprocessor C code as an example for C++ metaprogramming.

orbifold · on May 6, 2020

I think LLVM and its Tablegen mechanism is a good example that large C++ projects almost inevitably will contain some code generation facilities. In that case this mechanism is rather poorly documented and is used to generate 10+ different targets.

I believe the facilities provided by circle would make most of the Tablegen infrastructure redundant.

pjmlp · on May 6, 2020

I guess most of that can already be replaced by constexpr/consteval.
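To give a flavour of what that could look like, here is a toy C++17 sketch: a declarative list of records plus tables derived from it at compile time. The instruction set, field names and helper functions are all invented, and this is nowhere near the scope of what Tablegen actually generates.

```cpp
#include <array>
#include <cstdint>
#include <string_view>

// Hypothetical instruction description, standing in for a generated record.
struct Instr {
    std::string_view mnemonic;
    std::uint8_t opcode;
    std::uint8_t operands;
};

// The single declarative source of truth.
constexpr std::array<Instr, 3> kInstrs{{
    {"add", 0x01, 2},
    {"neg", 0x02, 1},
    {"ret", 0x03, 0},
}};

// Derived table, built by the compiler: operand count indexed by opcode.
constexpr std::array<std::uint8_t, 256> make_operand_table() {
    std::array<std::uint8_t, 256> table{};
    for (const auto& instr : kInstrs) table[instr.opcode] = instr.operands;
    return table;
}
constexpr auto kOperandsByOpcode = make_operand_table();

// Compile-time lookup by mnemonic; returns -1 for unknown names.
constexpr int opcode_of(std::string_view name) {
    for (const auto& instr : kInstrs) {
        if (instr.mnemonic == name) return instr.opcode;
    }
    return -1;
}

// Usage example: the derived data is checked before the program ever runs.
static_assert(kOperandsByOpcode[0x01] == 2);
static_assert(opcode_of("neg") == 0x02);
```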

jimbob45 · on May 6, 2020

I no longer share your viewpoint. Not selling someone a footgun only ensures that they either glue a footgun onto whatever you sell them or go buy from another vendor.

jcelerier · on May 6, 2020

Exactly, the alternative to not having code generation as a language feature is not people not doing it, it's :

- people writing external code generators (moc, MIDL, tinyrefl, a random python script, and a hundred other possibilities)

- people doing code generation in their build system (CMake allows that relatively easily for instance)

- people doing code generation with macros (verdigris instead of moc, boost.pp)

- people writing clang extensions which get outdated in 6 months due to LLVM code churn (https://github.com/AustinBrunkhorst/CPP-Reflection and a few others)

e.g. just look at one of the latest C++ questions on SO : https://stackoverflow.com/a/61623940/1495627

People are resorting to friggin bash scripts because they don't have that feature. Between unmaintainable bash scripts and type-checked C++ code, what do you think is better?

BruceEel · on May 6, 2020

Recently, ( https://news.ycombinator.com/item?id=23055121 ), Walter Bright pointed out that although D has extensive compile-time meta programming support, system (and, I imagine, arbitrary dll) calls are explicitly not allowed because of security concerns.

If I understand the Circle docs correctly...

   [...] searched for in the pre-loaded standard binaries: libc, libm, libpthread, libstdc++ and libc++abi. 
   Additional libraries may be loaded with the -M compiler switch. When the requested function is found, 
   a foreign-function call is made, [...]

...all libraries are fair game? And I guess you might be able to do your own function/dll probing with libdl.

Do you guys see this as a feature or a liability?

Reelin · on May 6, 2020

I love the D language but I actually think its designers got that point wrong.

Compilers are _far_ from security hardened and an attacker slipping something evil into the output binary is probably equally as bad anyway (you distribute it to your users after all). Ultimately you shouldn't be compiling code you don't trust without a good reason and appropriate precautions.

As a counterexample, as far as I'm aware Common Lisp makes no distinction between execution that occurs at compile time versus run time. It still seems to be doing pretty well though!

guenthert · on May 6, 2020

> Common Lisp makes no distinction between execution that occurs at compile time versus run time.

That's not quite phrased correctly. Common Lisp very well makes such a distinction, but allows code to be executed at compile time, load time or run time. See http://www.lispworks.com/documentation/HyperSpec/Body/s_eval... for details.

rurban · on May 6, 2020

Common Lisp programmers know what they're doing. We have the same problem in Perl, where people don't get the difference between a BEGIN block and an INIT block.

With C++ the template syntax is so horribly convoluted that I doubt people get the idea of compile-time expressions: what is allowed, and what is forbidden.

rurban · on May 6, 2020

The problem with compile-time expressions is side effects.

They are only done locally, which is sometimes not what you want. Syscalls, file I/O and Config checks are not done at runtime, and 99% of the time this is a bug. You really need to know what you're doing. And each such side effect is only done once, when you run the compiler, not at the client.
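Standard C++ cannot make syscalls at compile time, so as a stand-in the sketch below uses the `__DATE__` macro, which is fixed when the compiler runs. The environment variable name `APP_VERBOSE` and the function name are hypothetical. The point is only the contrast: one value is frozen on the build machine, the other is read on every run.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Fixed once, when the compiler runs on the build machine. Every client
// sees the date of compilation, not the date of execution.
constexpr const char* kBuildDate = __DATE__;

// Evaluated each time the program runs, on the client's machine.
bool verbose_enabled() {
    const char* value = std::getenv("APP_VERBOSE");  // hypothetical variable name
    return value != nullptr && std::strcmp(value, "1") == 0;
}

// Usage example: __DATE__ always has the form "Mmm dd yyyy" (11 characters).
void demo_build_time() {
    assert(std::strlen(kBuildDate) == 11);
    (void)verbose_enabled();  // result depends on the environment at run time
}
```

A config check written in the first style silently answers for the build machine, which is the bug described above.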

kristiandupont · on May 6, 2020

I agree. This is very impressive tech and I definitely see the appeal. I once spent a lot of energy attacking the same problem from the other end so to speak, by generating code at runtime (https://github.com/kristiandupont/rtasm/). I managed to make some logic that had many levels of loops and conditions perform extremely well with it but when I had to debug that stuff a year later, I was basically at my wit's end.

Currently, I am writing JS and TS code and I do quite a bit of code generation. It's great -- it goes into my repo so I get nice diffs when I make a change, I have easily debuggable code and my generator-code can be "unclean" and support weird edge cases through simple if-statements. Of course, my younger self would feel contempt bordering on pity for someone like me who clearly has no sense of beauty, or integrity, really. :-)

nurettin · on May 6, 2020

If you read the examples, they aren't weird recursive template hacks, you just prefix some code with @meta and it goes to compile time.

phekunde · on May 6, 2020

> But, I think this trend might lead to extremely hard to read code, and there is a good chance that this hard-to-read code will be treated as some black box/voodoo.

Boost libraries fall into this category.

neutronicus · on May 6, 2020

Code readability is one issue, build times another.

Chandler Carruth had a nice talk at CppCon 2019 about how widespread usage of protobuf wound up causing the compiler to time out on single translation units in the Google code base.