
Cog: Use pieces of Python code as generators in your source files - ingve
https://nedbatchelder.com/code/cog/
======
xg15
To be honest, what I don't like about this is that it once again operates on
the character level. I feel this brings us back to all the issues we had with
the C preprocessor and, in addition, makes any IDE analysis/assistance hard to
impossible.

I feel the tool would be more useful if you could process the target
language's AST instead. This would result in hygienic macros as well as making
the code easier to analyse (and might solve the whitespace problems as well,
as a formatter could render the whole tree in the end, after any code
generation was already applied)
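
As a rough illustration of the AST-level idea, here is a minimal sketch using Python's standard `ast` module. It assumes Python is both the generator and the target language, purely to keep the example self-contained; the template, the `FIELD` placeholder, and the `Fill` and `generate` names are hypothetical. The tree is filled in node by node and rendered once at the end with `ast.unparse` (Python 3.9+), so whitespace is handled by the renderer rather than by the generator.

```python
import ast

# Hypothetical template: FIELD is a placeholder replaced at the AST level.
TEMPLATE = "def get_FIELD(obj):\n    return obj.FIELD\n"


class Fill(ast.NodeTransformer):
    """Replace the FIELD placeholder in function names and attribute accesses."""

    def __init__(self, field):
        self.field = field

    def visit_FunctionDef(self, node):
        node.name = node.name.replace("FIELD", self.field)
        self.generic_visit(node)  # also rewrite the function body
        return node

    def visit_Attribute(self, node):
        self.generic_visit(node)
        if node.attr == "FIELD":
            node.attr = self.field
        return node


def generate(fields):
    """Build one getter per field and render the whole tree as source text."""
    module = ast.Module(body=[], type_ignores=[])
    for field in fields:
        tree = Fill(field).visit(ast.parse(TEMPLATE))
        module.body.extend(tree.body)
    return ast.unparse(ast.fix_missing_locations(module))


if __name__ == "__main__":
    print(generate(["x", "y"]))
```

Because only `FunctionDef` names and `Attribute` nodes are rewritten, a string literal that happens to contain `FIELD` would be left alone, which is the kind of mistake a character-level substitution can make.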

~~~
anon4242
What you want is something akin to Terra:
[http://terralang.org/](http://terralang.org/)

~~~
xg15
Didn't know about that yet. That indeed looks interesting. Thanks a lot!

~~~
sitkack
[https://github.com/pallene-lang/pallene](https://github.com/pallene-lang/pallene)
[https://github.com/titan-lang/titan](https://github.com/titan-lang/titan)

------
fphilipe
The Swift code base uses something similar that the core team wrote: GYB,
Generate Your Boilerplate. It’s used to generate several variants of similar
code that would be cumbersome to maintain otherwise.

[https://github.com/apple/swift/blob/master/utils/gyb.py](https://github.com/apple/swift/blob/master/utils/gyb.py)

~~~
wmu
> GYB, Generate Your Boilerplate

Love the name :)

------
Bjartr
I use this extensively in a production codebase to help facilitate keeping
things DRY across multiple languages/filetypes (java, xml, less, html) while
not locking me into a framework since, at the end of the day, if I want to
stop using cog, I'm still left with completely normal code.

I've layered a kind of DSL (more Python in comments with a different marker)
on top of cog so multiple files can reference the same metadata (domain model
fields in my case) when doing the codegen.

------
CoolGuySteve
Having worked with script generated C++ in the past, this looks annoying as
hell to debug.

Whereas macros and templates have compiler support to give line numbers inside
the macro/template, generated code errors have an extra step of having to look
at the C++, find the error in the generator, rinse, repeat.

Lambdas are better, but if you have to repeat yourself in a way that's too
syntactically weird for a template or lambda, you can "#define MY_MACRO(...)"
and end it with "#undef MY_MACRO" to keep the namespace clean.

~~~
ironmagma
Funny you say that, as we use Cog to avoid having to deal with C++ templates
and their associated pains. The nice thing about Cog is that it operates like
any Linux command-line tool, just at the text level, and as a result, if
something has gone wrong at the compilation phase, you can see what was fed
into the compiler to see what exactly was generated. Interpreting C++ template
error outputs is an art in and of itself.

------
gh02t
This is cool and I could see myself using this, but I wonder why it's
necessary to `import cog` in the examples? Seems like it'd be better to just
include cog implicitly in the namespace by design for these snippets since
you're practically always going to use it.

~~~
philipov
Explicit is better than implicit. Automatically importing it makes the code
look like magic.

~~~
nedbat
As it happens, "cog" is also implicitly imported, though tbh I forget why....!

~~~
gh02t
Awesome, that makes a lot more sense to me. I'm a long-time fan and reader,
keep up the great work!

------
monocasa
Oh geeze, this reminds me of the people who use perl to autogen a bunch of
verilog (which is super common for whatever reason).

Just don't, instead find a macro system that actually understands your
language.

~~~
nedbat
What macro system should I use if I want to define a data schema in one place,
and then generate C code and SQL code from it?
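
A minimal sketch of the single-schema idea, in plain Python so that it runs without cog installed. The schema layout, the table and field names, and the `c_struct` and `sql_table` helpers are all hypothetical; with cog, the same functions could be called from a generator block in each target file.

```python
# Hypothetical schema: (field name, C type, SQL type). All names are illustrative.
SCHEMA = {
    "table": "users",
    "fields": [
        ("id", "int", "INTEGER"),
        ("age", "int", "INTEGER"),
        ("score", "double", "REAL"),
    ],
}


def c_struct(schema):
    """Render the schema as a C struct definition."""
    lines = [f"struct {schema['table']} {{"]
    for name, c_type, _sql_type in schema["fields"]:
        lines.append(f"    {c_type} {name};")
    lines.append("};")
    return "\n".join(lines)


def sql_table(schema):
    """Render the schema as a SQL CREATE TABLE statement."""
    cols = ",\n".join(
        f"    {name} {sql_type}" for name, _c_type, sql_type in schema["fields"]
    )
    return f"CREATE TABLE {schema['table']} (\n{cols}\n);"


if __name__ == "__main__":
    print(c_struct(SCHEMA))
    print(sql_table(SCHEMA))
```

Adding a field to `SCHEMA` changes both outputs on the next run, which is the part a language-specific macro system cannot do across C and SQL at once.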

------
NateDad
I wrote a generic version of cog that can use any language as the generator
code. It's called gocog, because it's written in go, but once compiled, it's a
static binary, and you don't need go on the host machine.

[https://github.com/natefinch/gocog](https://github.com/natefinch/gocog)

It's directly built off of cog's ideas and mimics much of cog's interface. (I
worked with Ned, cog's author back in the day, and really enjoyed having cog
to write boilerplate for me).

gocog is some of the first code I wrote in Go, so it's not super pretty code,
but it's a very useful little tool for generating boilerplate.

------
pcr910303
This is better in the sense that the user doesn't need to learn something new
except for the language the user is programming in and python (even though
theoretically manipulating the AST will be less error prone, etc...). I use
javascript everywhere, but I still didn't learn how to make babel
plugins/macros because copy/pasting snippets of code two or three times is
easier than learning. It's a pity that nobody has yet made a language
that has a super intuitive macro system like lisp (homoiconicity, the AST is
the language), and an intuitive syntax like python. I actually believe that
this is partly because most Lisp users don't like the idea of new syntax and
that all major lisps (CL, Clojure, Scheme) don't have syntax sugar by
default. I would appreciate if a new CL tutorial appears that uses infix
notation with reader macros(`#I`) or a Clojure tutorial that uses the infix
package([https://github.com/rm-hull/infix](https://github.com/rm-hull/infix)).
It will be great to beginners because 1. they wouldn't be scared of prefix
notation and 2. it shows (a part of) what lisp macros can do (introduce syntax
sugar in a way that is natural to the language).

------
phodge
I built something like this in ~2006 called "PHPinPHP" because I wanted to
generate PHP classes from my mysql schema. It even used "[[[...]]]" blocks
like cog does.

I eventually realized that A) generated code is completely unmaintainable; and
B) the reason I thought I needed code generation is because my base language
wasn't flexible enough.

Later on I switched to python and haven't yet hit a problem that I need code
generation to solve.

------
wodenokoto
So basically what you want out of your jinja templates in python.

Does it handle line numbers correctly on errors?

~~~
nedbat
Python errors in your Cog-generator will report the correct file and line
number in the larger containing file.

------
chriswarbo
Reminds me of lips (
[https://github.com/zc1036/lips](https://github.com/zc1036/lips) and
[https://github.com/rbryan/guile-lips](https://github.com/rbryan/guile-lips)
).

Cog looks a little over-the-top for my purposes, but still looks saner than M4
;)

------
zwegner
I wrote something very similar some years back:
[https://github.com/zwegner/prethon/](https://github.com/zwegner/prethon/)

I can't remember if I saw Cog first, and wrote a different version that fit my
needs better, or if I only found Cog afterwards...

It uses a more PHP-esque syntax for inserting Python code. It has inline
expression syntax, and quote functions, which IMO make it nicer than Cog for
using as a code preprocessor--it's easy to make e.g. function specializations,
or loop over code blocks. It's not very well documented though, and is
probably missing some nice features.

------
conistonwater
Why does this need to be a standalone tool? I can already do this in Emacs by
pasting emacs-lisp code snippets and executing them while editing the file,
inserting the output into it. Do other editors not have this feature?

~~~
anschwa
Not to mention using "C-u M-|" (shell-command-on-region).

~~~
nedbat
Have you actually looked at what cog does? I don't see how you would manually
use Emacs commands to do the same more than once.

~~~
anschwa
I'm not exactly sure I understand the use-case, but as far as the example used
in the article, shell-command-on-region accomplishes the same kind of thing.
Why not leave the "generation code" as a comment in your file the same way cog
does?

~~~
PurpleRamen
Template-generators like cog are meant to run periodically, for example every
time you 'compile' your project. Often they contain dynamic elements which can
change between each run.

Using your emacs-command would defeat that purpose, because you would need to
search the region at every run and re-execute it manually again and again and
again. And you would need to document the command anyway, because nobody
can remember all those regions. So why not automate this task then?
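
To make the "re-run on every build" point concrete, here is a minimal sketch of a tool that regenerates marked regions in place. It is not cog's implementation: the `[[[gen` marker, the `BLOCK` pattern, and the `regenerate` function are hypothetical, loosely modelled on cog's block layout, and the sketch assumes the generator code sits inside a block comment with no per-line comment prefixes.

```python
import contextlib
import io
import re

# Hypothetical markers: generator code sits between "[[[gen" and "]]]", and the
# previous output sits between that and "[[[end]]]".
BLOCK = re.compile(
    r"(\[\[\[gen\n(.*?)\]\]\][^\n]*\n)(.*?)([^\n]*\[\[\[end\]\]\])",
    re.DOTALL,
)

SOURCE = '''\
/* [[[gen
for n in ("a", "b"):
    print(f"int {n};")
]]] */
int stale;
/* [[[end]]] */
'''


def regenerate(text):
    """Re-run every generator block and replace its old output."""

    def run(match):
        head, code, _old_output, tail = match.groups()
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(code, {})  # runs the embedded generator; trusted input only
        return head + buffer.getvalue() + tail

    return BLOCK.sub(run, text)


if __name__ == "__main__":
    print(regenerate(SOURCE), end="")
```

Running `regenerate` a second time on its own output leaves the text unchanged, so it can be wired into a build step without manually locating regions.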

------
empath75
Seems like most languages where you'd use this already have macros?

~~~
nedbat
C++ macros can't read a configuration file to generate code (at least I don't
want to know that they can!). And Cog works in any text file, so it can be
used for languages (like HTML) that don't have macros.

~~~
a1369209993
They _kind of_ can, via #include, but the configuration file has to be in a
particular format and you're limited in what you can do with it.

------
radarsat1
One use case for this could be dumping generated algorithms from sympy. I was
doing some constraint programming and ended up almost writing something very
similar albeit poorly and ad-hoc, by generating .c files that I #included into
other .c files, it was very messy. The use case was to write some mathematical
relations and generate C functions to calculate their differentials. It was a
lot of manual copy-pasting until I came up with the #include trick, but this
would have been better.

------
Noumenon72
Your example could do with some syntax highlighting. There's so much ugly
punctuation in it I didn't even notice the actual code at the end. Plus my
first impression was that I would never put something so unreadable in my
source, whereas in a nice green comment I wouldn't care so much.

In Java I solved the problem of whitespace by just running my result through
google-java-format, but I see how Python's offside rule would make that
totally impossible.

~~~
nedbat
Good idea: I added syntax highlighting to the first example on the page.

------
lf-non
InGenR [1] is a similar utility I wrote sometime back.

It is similar to cog in that generated code resides alongside the source code.

Some differences/advantages are:

1\. Generators can be pure JavaScript or declarative dot templates.

2\. The generators can be distributed as npm packages as the generators are
resolved through npm's require resolution.

[1] [https://github.com/lorefnon/InGenR](https://github.com/lorefnon/InGenR)

------
kwhitefoot
Interesting stuff. Quite a while ago I wrote something vaguely similar using
JScript to make a demo of a sort of 'mathematical' document editor as part of
the VB Classic Wikibook, see
[https://en.wikibooks.org/wiki/Visual_Basic/JArithmetic](https://en.wikibooks.org/wiki/Visual_Basic/JArithmetic)

Might be interesting to revisit the idea.

------
cryptonector
I think jq would be a better DSL for this, not least because it's easy to
integrate libjq into C/C++/Rust programs.

I've also been thinking of building a trivial little library to use jq for
configuration files, where jq syntax is more convenient than JSON, and where
you can also write or alter configuration objects using path-based
assignments, so you get to choose JSON-style or TOML-style.

------
nicoddemus
We use it at work to:

* Insert generated C++ and Python boilerplate code
* Generate parametrized tests based on external data files
* Copy code from one place to another and keep them up-to-date
* Pin dependencies across multiple projects using a single source of truth

So this tool is immensely useful.

------
auscompgeek
I can't tell whether this is better or worse than header2whatever (piping C++
headers into Jinja2):
[https://github.com/virtuald/header2whatever](https://github.com/virtuald/header2whatever)

------
stephenbennyhat
Like erb ([https://ruby-doc.org/stdlib-2.6.2/libdoc/erb/rdoc/ERB.html](https://ruby-doc.org/stdlib-2.6.2/libdoc/erb/rdoc/ERB.html)).

------
LeonB
i've been using a similar thing, a pre-processor written in powershell, that
lets me embed a few different languages in any type of file.

i've used it for writing books, and interactive checklists, and for generating
static html websites.

this is the one i implemented:
[https://github.com/secretGeek/pre](https://github.com/secretGeek/pre)

------
jpochtar
I hear LISP has a pretty good macro system...

------
carapace
It's like PHP but Python, eh?

~~~
klyrs
My thought exactly... Python Hypertext Preprocessor...

------
okaleniuk
Or you can generate pieces of code with non-embedded Python scripts at the
earliest stage of make and inline them with the host language's preprocessor.

This way you'll have the same functionality but with standard tooling for
every language. This means conventional debugging, static analysis, testing
etc.

And no extra dependencies, too.

~~~
nedbat
I'm not sure what you mean by "standard tooling for every language." Cog will
work with any text file. Or do you mean that cog itself is non-standard?

~~~
okaleniuk
No-no, I mean if generating code is in a separate Python file, then I can
debug it with pdb, I can profile it with cProfile, I can run Pylint on it, -
all the standard tools.

I like the idea of everything sharing the same file, too, but it does make
working with the Python part a bit more difficult.

Also, with keeping it separate, the "every language" part comes up. It doesn't
have to be Python if it's a separate code generator. Whatever suits you will
work. You can generate code for C in C++. Or assembly in Common Lisp.
Everything in anything.

------
hguhghuff
I’d be interested to hear any use cases that people could imagine for this.

~~~
kingosticks
We have something at work that's very similar but perl-based for our VHDL
source. It saves a lot of boilerplate for conversion functions, null types,
read/write cpu functions etc. Vhdl has pretty awful templating so it's really
useful, you just need to use it sparingly else it becomes unreadable very
quickly.

~~~
InitialLastName
I do this using cog for Verilog code. It lets me generate signal processing
constants/math in the signal processing code itself, driven by a single
control file (with, e.g., the system sample rate) without having to rely on
Verilog's awkward (where even available) math.
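
A minimal sketch of that kind of constant generation, assuming a phase-accumulator tone generator as the example. The sample rate, tone frequency, accumulator width, and the `phase_increment` and `verilog_localparam` helpers are hypothetical and not taken from the comment above.

```python
import math

# Hypothetical control values; in practice these might come from one shared file.
SAMPLE_RATE_HZ = 48_000
TONE_HZ = 1_000
PHASE_BITS = 32


def phase_increment(tone_hz, sample_rate_hz, bits):
    """Phase-accumulator step for a tone at the given sample rate."""
    return round(tone_hz / sample_rate_hz * 2**bits)


def verilog_localparam(name, value, bits):
    """Format one Verilog localparam line with a sized hex literal."""
    digits = math.ceil(bits / 4)
    return f"localparam [{bits - 1}:0] {name} = {bits}'h{value:0{digits}X};"


if __name__ == "__main__":
    inc = phase_increment(TONE_HZ, SAMPLE_RATE_HZ, PHASE_BITS)
    # prints: localparam [31:0] PHASE_INC = 32'h05555555;
    print(verilog_localparam("PHASE_INC", inc, PHASE_BITS))
```

Changing `SAMPLE_RATE_HZ` in one place updates every derived constant on the next run, with the arithmetic done in Python rather than in Verilog.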

------
Eli_P
So, it's like the _m4_ macro processor, which the author used as XSLT. Could be a good
call today if Cog had access to python's AST instead of plain text and
interacted with Swagger/OpenAPI or something, I mean tools like _autorest_.

