Compile-Time Evaluation in the Zig System Programming Language

charlieflowers · on March 1, 2017

Nice, interesting. Reminds me a bit of Terra, which is the most impressing programming language concept I've ever seen (if only it were ready for prime time usage). http://terralang.org/

chubot · on March 1, 2017

Yeah I really like Terra. One thing I've been wondering is if it can be done with C or C++.

In other words, could you replace the C preprocessor / C++ metaprogramming with something like Lua? That is, have Lua/C instead of Lua/Terra.

That would be adopted a lot faster than a brand new language. I think one problem as always is parsing C++, but that should be doable with Clang now.

EDIT: Also, what about it is not ready for prime time?

jacobush · on March 1, 2017

Not convinced... the C preprocessor is tolerated because it is so small.

Lua + C would really confuse at least me. But nothing's really stopping you I guess. Lua, or any script-ish language could be used instead of the C preprocessor. Why not Python? Perl?

makapuf · on March 1, 2017

something I never understood is why not compile-time C semantics and proprocessor and actually enforce some compiler optimizations ?

This is basically the direction of constexpr in fact with C++.

I would love a new, slightly fixed/incompatible C17 (not D or subpart of C++, just C11) with

- c++17-like constexpr (and its uses anywhere : switch cases, array sizes.. ) - deprecated oldstyle args, undeclared functions, trigraphs - (templates can be mostly done with pure inlining+constexpr) - better keywords (static -> private (by default/explicit public), persist ) - fixed operators precedence (1<<3 +1=16 yay) - typed enums with explicit storage class - reduce preprocessor ability(discourage and just generate from lua/python/C programs if you really need, maybe with a shebang-like line). replace conditional compilation with constexpr ifs,

ndh2 · on March 1, 2017

You already answered the question: Because it's small.

jacobush · on March 1, 2017

Yes I agree

shakna · on March 2, 2017

M4 comes with most Linux distributions. Similar to the cpp, but recurzive with a couple extra features. It's just a little painful to use.

mikulas_florek · on March 1, 2017

That compile-time concept is great. I've wanted C++ to have something like that for a long time. The only problem is that there is no way to debug this compile-time code the same way one can debug run-time code. If there is an error, it produces static message which is hard to understand in more complicated stuff. You can see this in complicated C++ templates.

dom96 · on March 1, 2017

I'm not sure about Zig, but in Nim compile-time code produces a stack trace just like run-time code. But perhaps you are talking about debugging using a debugger like GDB? That is indeed not supported. But using a simple `print` statement is enough in most cases for debugging.

jokr004 · on March 1, 2017

Yeah he's definitely talking about a full fledged debugger like GDB. For the most part yeah, some print statements will get the job done, but that's a serious pain in the ass if you've got a ton of variables that you need to look at. Significantly simpler to just be able to look at everything at once. Also, I'm unfamiliar with Zig, but is it even possible to print anything from your code during compilation? Seems like you wouldn't be able to do that.

billsix · on March 1, 2017

In libbug you can do anything at compile time. Open and write to files, run a test suite, anything. http://billsix.github.io/cac.html

jokr004 · on March 1, 2017

Very cool!

mikulas_florek · on March 2, 2017

Yes, I was talking about full-fledged debugger. Even VC++ quite often prints the whole stack when there is an error in template code, but the messages can be tens of lines long, and sometimes do not contain all the necessary information. This and long compile times are the reason why many people tend to avoid templates and template metaprogramming.

quelltext · on March 1, 2017

It shouldn't be too hard to incoroprate the debugging of compile time computation.

mikulas_florek · on March 2, 2017

Indeed, yet nobody does that.

toolslive · on March 1, 2017

Two remarks:

- sadly, nobody mentions BET MetaOCaml http://okmij.org/ftp/ML/MetaOCaml.html which is probably first place you want to look before you try to build this yourself.

- in Lisp, this is trivial (as explained here : http://www.ymeme.com/hackers-introduction-partial-evaluation... )

abecedarius · on March 1, 2017

Aside: as the author of the second link I'd rather you pointed to http://wry.me/~darius/writings/peval/ (though I'm grateful to ymeme for posting it without permission, since the original host fell off the web.)

Of course, Lisp isn't really a systems language like Zig -- I think it's great that it's tackling the problem its own way.

kazinator · on March 1, 2017

Yes; rather, a systems language like Zig is an activity you can do in Lisp.

qznc · on March 1, 2017

Why does the function declare which parameters are "comptime"?

Recently, I was thinking about a language, where the user could decide which parameters to make comptime. Examples:

Normal function call: foo(a, b)

Function call with comptime: foo(comptime 42, b)

What actually happens at runtime: fooVariant(b)

The compiler makes a duplicate of foo, where the first parameter is removed and the constant 42 used instead. Compiler writers may know this as the transparent "procedure cloning" optimization.

This increases flexibility, because the writer of a function does not have to decide which parameters are comptime and which are not. This sometimes leads to duplications, for example with regular expressions, where the pattern is very often comptime, but not always.

skissane · on March 1, 2017

One issue I see is how far to propagate the comptime?

If I have one complex function which calls 50 other functions, do I clone all of them too? And what about functions called indirectly? The amount of cloning could become massive, and it might lead to worse performance (e.g. so much code is cloned that the code no longer fits in the CPU caches.)

Conversely, if I am only cloning the function I am calling, then that might not be a very useful feature, because the function I am calling might do very little – it might just be a thin wrapper around another function.

The advantage of doing it on the function parameter declaration is the function author has to work out the propagation aspect themselves (by deciding whether to make the arguments of the indirectly called functions comptime or not.)

ttd · on March 1, 2017

Something similar is "partial evaluation", and LLVM (for example) supports it via a component called LLPE [1,2].

One of its main drawbacks is it falls under the category of interprocedural optimization (as opposed to intraprocedural) which can quickly balloon into a practically intractable problem for large programs.

Also, it's just difficult to imagine it working in many non-cherry-picked examples. You have to prove that a given parameter is constant over all callsites to a function, which can be difficult e.g. in the presence of indirect/virtual function calls.

[1] http://www.llpe.org/ [2] http://llvm.org/devmtg/2015-04/slides/LLPE.pdf

DannyBee · on March 1, 2017

"Why does the function declare which parameters are "comptime"? "

Because it's at least exponential time and space otherwise, and sometimes much more :)

(becasue you may have to duplicate the entire call path rooted at foo)

throwaway7645 · on March 1, 2017

I like the simplicity of the website and the authors dedication to releasing both updates & informative articles.

My question (I'm not a systems coder), is how is this really better than C? How can it be even remotely as efficient as a compiler with decades of optimization witchcraft and sorcery? Is it really that much easier to write? I'd be curious in seeing a small 1/2 page code sample difference and seeing benchmarks on performance, compile times...etc. Nim normally jumps to mind as an easy to write language that transpiles to C and is therefore usually fairly equivalent, but is almost as easy to write as python. How does Zig fit in? Also, good luck on the project!

comex · on March 1, 2017

> How can it be even remotely as efficient as a compiler with decades of optimization witchcraft and sorcery?

This is just a frontend. The optimization witchcraft and sorcery all lives in the middle/backend, which in this case is LLVM. LLVM can do a decent job optimizing pretty much anything you throw at it, though broadly speaking it works better the more the IR looks like C. For example, Rust does a lot of moves of large structures, unlike C/C++; IIRC pcwalton submitted some optimizations involving those to LLVM, which are language-neutral but which nobody had bothered to implement before because C/C++ didn't need them as much.

throwaway7645 · on March 1, 2017

Ah didn't know it was using LLVM...thanks.

setr · on March 1, 2017

>How can it be even remotely as efficient as a compiler with decades of optimization witchcraft and sorcery

I don't know enough to say anything myself, but i believe it mostly boils down to C allowing too much freedom, with tok few constructs, with too little communication between programmer and compiler.

The C compiler must derive certain facts from a given block of code to optimize it in a particular fashion; and the derivation must be correct for all edge cases. The witchcraft and sorecery is in these derivations, and handling pathological edge cases.

AFAIK these C-alt languages tend to skirt this problem by adding language constructs that make certain promises to the compiler, like Zig's comptime attribute and rust's borrow checker. They catch up to C, at least for some common classes of problems, by simply giving the programmer more expressive power.

And then, I imagine, they have an easier time taking some sequence of these (stronger) promises and combining them into further sets of optimizations.

Which is also what "higher level" languages do (more by limiting freedom of action, and increasing expressive ability, than communication with the compiler) and how you get ideas like haskell being as fast/faster than C code in the common case (but won't match handwritten optimized C). A Haskell compiler can make bigger steps more safely than a C compiler can, simply because the language offers more/bigger guarantees with each construct.

astrange · on March 1, 2017

Haskell isn't really faster than C - that is, once you know the problem I could write a C program faster[1] than yours and it wouldn't be "carefully handwritten optimized C" as much as "kind of casually handwritten C".

(If this is an object-manipulating program and not a numeric one, substitute another language so I don't have do memory management.)

The issue is that magic high-level-compiling compilers can't be developed because nobody has time to wait for their compile to finish. This is also why JIT never really optimizes very hard.

[1] or more importantly, less memory-using and live profiling tools and gdb will work.

joosters · on March 1, 2017

Plenty of people would love to give their compiler time to optimise, in fact they already do, with hours-long build times of some projects. I don't think compile time is the thing holding back compiler development, after all, every time-consuming optimisation can be put behind a command-line switch to allow those who want it to use it, and those who want a fast compile can go without.

astrange · on March 3, 2017

I think you'll run into a problem where your optimizations are time-consuming but not actually productive.

When you're going low-level like in C, optimizations are famously useless on x86 - you can barely tell if most programs are running at O0 or O3.

Going higher level, they should help, but it's really an interactive process so a fast compile/run cycle is still important. You want to know how well the optimizer understands your program and which idioms are safe to use.

Kutta · on March 1, 2017

Obviously, we should not compare small personal project languages with decades-old industry languages by factors determined by the sheer amount of work thrown in. We compare them mostly by fundamental design principles.

In this case, partial evaluation is a massively more principled and efficient way of doing metaprogramming than preprocessing or macros. There are limitations compared to Lisp-style unrestricted untyped metaprogramming, but not any that I would much miss in practice.

With partial evaluation, we automatically get type and scope safety, and also only need one language instead of several. Optimizations also cascade in a principled way: we can compile code-generating code to fast machine code, which we then use to generate faster code or generate it faster, but the then-resulting code can be also used to generate code, and so on.

ant6n · on March 1, 2017

At the same time, Zig is basically only a thin layer on top of LLVM. Since it's a low-level language itself, it can get away with having close to no static flow analysis beyond this partial evaluation (and maybe types?)

throwaway7645 · on March 1, 2017

I knew I was missing something. Thank you.

panic · on March 1, 2017

How can it be even remotely as efficient as a compiler with decades of optimization witchcraft and sorcery?

Using LLVM (which Zig does) gives you access to the same middle-end and back-end optimizations as clang.

the_duke · on March 1, 2017

It's refreshing to see another attempt at a systems language that sheds the verbosity of C++, D and Rust.

Nim, C++, D (and Rust in a very limited fashion) all have compile time function evaluation, but not remotely with such a low overhead syntax.

p0nce · on March 1, 2017

D has no overhead syntax for CTFE, "comptime" is from the context.

dom96 · on March 1, 2017

What do you mean by "low overhead syntax"? I would call Nim's syntax simple and consider it to be "low overhead" so I'm not sure it belongs in your list.

throwaway7645 · on March 1, 2017

Agreed from at least a beginners' perspective. I think there are 70 reserved keywords, but I can't imagine using 1/2 in practice. Most of those are just things like int8, int16, uint8...etc.

bachmeier · on March 1, 2017

> Nim, C++, D (and Rust in a very limited fashion) all have compile time function evaluation, but not remotely with such a low overhead syntax.

What is high overhead about D's syntax for compile time function evaluation?

ndh2 · on March 1, 2017

I was wondering why `inline if` would be necessary. Apparently it's already been removed: https://github.com/andrewrk/zig/commit/b78c91951a1db34c615a0...

a-nikolaev · on March 1, 2017

Explicit `inline if` could give a compile-time error if the condition was not comptime.

pjmlp · on March 1, 2017

Regarding C's printf example, "__attribute__ ((format (printf, x, y)))" is gcc idiom, a C compliant compiler is not required to provide any kind of support against format string errors.

doomrobo · on March 1, 2017

Very cool. But the more I see of compile-time code the more I think it's a bad idea. The presented "toy" function with the inline while loop is essentially a macro that's doing compile-time code generation. I can easily see myself calling that function with a non-constant value and expecting it to compile (after all, functions are functions and they take in values regardless of their token type). Conflating the two runtimes seems like it could be frustrating. In a lot of cases you also get the same benefits (in this case, monomorphization) automatically due to compiler optimizations. Also I think most people working in production environments would have some sort of expectation about the time it takes to compile a library in Zig, but allowing arbitrary compile-time computation means that they cannot even guarantee that the compilation will halt!

jonathanstrange · on March 1, 2017

I think the right model for compile-time evaluation is to have an interpreter and phases, making the full language available at compile-time but clearly distinguishing between compile-time, macro-expansion, module elaboration and runtime phases. Scheme dialects generally do that, although they should maybe define module visiting and elaboration orders in general more clearly like e.g. Ada does.

I don't know if Zig does that. Anyway, having different phases is not conceptually hard, although it can bite you a bit with Macros in Scheme.

billsix · on March 1, 2017

In my project, http://billsix.github.io/cac.html, the Compile-Time language is the same as the runtime language. Tests are collocated with procedure definitions, but evaluate only at Compile-Time.

pg314 · on March 1, 2017

Arbitrary compile time evaluations in the form of macros have been around in Lisp since 1963 [1]. I'm sure that pretty much every Lisp programmer thinks macros are a great idea. If you want to learn more about them, I recommend Practical Common Lisp (a great intro) [2] or On Lisp (requires prior Common Lisp knowledge) [3].

[1] ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-057.pdf

[2] http://www.gigamonkeys.com/book/

[3] http://www.paulgraham.com/onlisptext.html

mapcars · on March 1, 2017

It's so strange to see people discussing cool new ideas, which are half-century old. Seems like we need history lessons for programmers specifically.

ben-schaaf · on March 1, 2017

The templating systems in C++ and D are Turing complete. Apparently Rust's type system is also Turning complete.

gpderetta · on March 1, 2017

The type system of every (statically typed) programming language attempts to expand until it becomes Turing complete. Those languages which cannot so expand are replaced by ones which can.

With apologies to jwz.

jonathanstrange · on March 1, 2017

This is a very interesting language, but the documentation on its main page is sparse. The highlighted features leave a lot open. I wonder how exactly it compares to C, and what's it's relation in terms of safety to Ada and Rust. More importantly, what kind of memory management does it use? Does it have a GC?

If it has a GC, I'd find it even more interesting...

naasking · on March 1, 2017

I wondered about memory management too considering this language is supposed to "prioritize safety". Memory is manually managed though, which you can see from their HashMap example on the website.

The description indeed leaves a lot open to interpretation. They should have a brief overview that uses some standard terminology to describe Zig.

Sean1708 · on March 1, 2017

That's when using a custom allocator though, browsing through the other examples it doesn't look like there's any manual memory management going on but that could just be the examples chosen.

naasking · on March 1, 2017

But it does an explicit free operation, and no mention of preventing double-frees, which to me implies it's unsafe.

comex · on March 1, 2017

Regarding Rust:

> There is another thing I noticed, and I hope someone from the Rust community can correct me if I'm wrong, but it looks like Rust also special cased format_args! in the compiler by making it a built-in. If my understanding is correct, this would make Zig stand out as the only language of the three mentioned here which does not special case string formatting in the compiler and instead exposes enough power to accomplish this task in userland.

It depends what you mean by "userland". Rust has long supported procedural macros, albeit not on stable compiler versions until recently (and still only hackily there). Procedural macros, a.k.a. compiler plugins, are written in Rust, but work differently from compile-time function evaluation. With CTFE, the code running at compile time lives in the same "world" as the rest of the code being compiled, in a sort of pretend version of the target environment where you can't do things like call external functions. Rust compiler plugins are just separate libraries that get compiled for the host architecture and loaded into the compiler process (which is less scary than it sounds; the build process "just works" thanks to Cargo, and Rust's safety ensures you don't stomp on random compiler state). You have access to the full language and can call whatever you want, even do things like file I/O if needed; on the other hand, you're in a separate environment and can't just name variables, functions, etc. from the runtime world.

Currently, Rust compiler plugins are purely syntactic: you can look at the tokens of the macro invocation and that's it. You can't, say, ask the compiler for the definition of a type or the value of a named constant; plugins are executed too early for that. I think it'd be nice if that changed in the future, but it would be hard to implement without giving up the ability to emit arbitrary syntax as the output of the macro; there would be trouble with circular dependencies (think type inference).

But what currently exists is enough to implement safe printf, in conjunction with Rust's type system. format_args! is built into the compiler, so it could cheat, but it doesn't; like user-defined macros it's purely syntactic. The following Rust code:

    println!("a={} b={}", a, b);

expands to something like this (I've simplified it a bit):

    _print(Arguments::new_v1(&["a=", " b=", "\n"],
                             &[ArgumentV1::new(&a, Display::fmt),
                               ArgumentV1::new(&b, Display::fmt)]));

format_args! parses the format string and splits it up, and separates the arguments into a series of calls, without needing to know the types of a and b. ArgumentV1::new is just a regular generic function that can be called with any type implementing the Display trait. (Actually, it depends on the second argument, which is a function, but the details aren't important; there's nothing particularly special happening.)

Arguably it's nicer to be able to know the types of the arguments within the format parsing routine, and other use cases require that kind of feedback, but Rust's approach works pretty well here.

By the way, Rust also has a Turing-complete type system and a rudimentary form of compile-time function evaluation, which in the future might be powerful enough to express code similar to the Zig implementation, but for now it's not really designed for that sort of thing.

Sidenote: While C's metaprogramming is rather limited, C++14 CTFE is pretty powerful and you can parse format strings at compile time there too:

https://akrzemi1.wordpress.com/2011/05/11/parsing-strings-at...

blobman · on March 1, 2017

How did andrewrk disable allowing people to follow him on github? :-o

laurent123456 · on March 1, 2017

It doesn't look like he disabled it, I can follow him.