
ZZ is a modern formally provable dialect of C - the_duke
https://github.com/aep/zz
======
MaxBarraclough
I find the basic idea of this project to be very compelling - I was thinking
aloud on HN recently and arrived at roughly the idea this project is
implementing. [0]

With that said, I really dislike the way they're describing their project.

When I read _safe dialect of C_, I first assumed they meant they had
developed a safe subset of C, or perhaps a very similar language, like OpenCL
C [1]. Instead, they developed a new language which isn't C at all. Nothing
wrong with that, but if I don't instantly recognise the syntax as C, I
wouldn't call it a C dialect.

They also put _formally provable dialect of C_. Their language compiles to C
code which is guaranteed to be free from undefined behaviour. This is not the
same thing as a language where hard guarantees can be made about program
behaviour, such as SPARK [2] or Dafny [3].

If the authors are reading this, I urge you to improve your project summary.
Your language does not allow me to prove program correctness, instead it
protects me from C's undefined behaviour. That's still a great idea! Please
make this clear!

[0]
[https://news.ycombinator.com/item?id=22102658](https://news.ycombinator.com/item?id=22102658)

[1]
[https://en.wikipedia.org/wiki/OpenCL#OpenCL_C_language](https://en.wikipedia.org/wiki/OpenCL#OpenCL_C_language)

[2]
[https://en.wikipedia.org/wiki/SPARK_(programming_language)](https://en.wikipedia.org/wiki/SPARK_\(programming_language\))

[3] [https://en.wikipedia.org/wiki/Dafny](https://en.wikipedia.org/wiki/Dafny)

~~~
pickdenis
> Your language does not allow me to prove program correctness

Isn't that the whole point of the SMT solver?

What about this example (that doesn't compile)?

    
    
      fn bla(int a) -> int
          model return == 2 * a
      {
          return a * a;
      }
    

Isn't that verifying program correctness? A sibling of this comment claims
that "the only thing proven is memory access validity" but, again, this
example takes that down.
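To make the example concrete: the `model` clause claims the result equals `2 * a`, while the body returns `a * a`, and the two disagree for most inputs. The Rust sketch below is not ZZ and involves no SMT solver. It only brute-forces a small range to find an input where the claimed postcondition fails, which is the kind of counterexample a solver would be expected to report. The names `model_holds` and `first_counterexample` are made up for illustration.

```rust
/// The body from the ZZ example: returns a * a.
fn bla(a: i32) -> i32 {
    a * a
}

/// The postcondition claimed by the `model` clause: return == 2 * a.
fn model_holds(a: i32, ret: i32) -> bool {
    ret == 2 * a
}

/// Searches a small range for an input that violates the postcondition.
/// A solver reasons symbolically; this loop only illustrates the idea.
fn first_counterexample(range: std::ops::RangeInclusive<i32>) -> Option<i32> {
    range.into_iter().find(|&a| !model_holds(a, bla(a)))
}
```

The claim happens to hold for `a == 0` and `a == 2`, so a handful of spot tests could miss the bug, whereas a proof over all inputs cannot.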

~~~
MaxBarraclough
I agree that it is. Looking at the page, I can't see how far this goes.

Was my earlier comment completely wrong? Does ZZ allow the programmer to
express a formal specification, e.g. to verify a sort function? If so, their
examples are selling their language very short.

~~~
a3p
Proof of algorithms is possible as long as there's a known method of doing so
in SMT. That means in practice, if someone has written a paper for formally
proving an algorithm in SMT, you can mostly copy paste the proof.

ZZ is developed in parallel with a large project using ZZ, and new syntax sugar
features will surface slowly as they become practically useful.

That being said, it will never replace external formal verification with
something like Coq. They serve a different purpose.

~~~
loeg
What's the large project using ZZ? Or is it behind closed doors? This appears
to be an entirely different ZZ programming language:
[https://scratch.mit.edu/discuss/topic/80752/](https://scratch.mit.edu/discuss/topic/80752/)

~~~
a3p
It is [https://devguard.io/](https://devguard.io/), which is being rewritten
from Rust to ZZ in this branch:
[https://github.com/devguardio/carrier/tree/zz](https://github.com/devguardio/carrier/tree/zz)

~~~
twsted
Very interesting.

Can you tell us a little more about the reasons for the rewrite from Rust?

~~~
loeg
I'm not involved, but one obvious answer is: broader portability than Rust
(this is explicitly called out in the ZZ article). Clearly devguard is
targeting a broader set of devices than the limited set Rust currently
targets (x86; arm, mips, riscv in tier 2, with various caveats for bare metal
targets).

------
scoutt
> where we still program C out of desperation

I agree it's the standard and the only _thing that actually works_ (author's
words), but it's still a pleasure for me to write and deal with C (for
embedded). I'd be desperate if I were forced to deal with _hugely_
different paradigms because of _pointer problems_ or _insert-your-C-rant-here_. C
is not going to be replaced on embedded any moment soon.

~~~
dmos62
> C is not going to be replaced on embedded any moment soon.

Why? Doesn't for example Rust without stdlib already cover the use cases? Note
I'm not experienced in embedded.

~~~
scoutt
In my personal opinion, it's basically for 3 reasons:

1) There is no need to. C already has all we need to build our systems. The
rest is seen as overhead/over-complication.

Regarding Rust, I try to keep up to date about its progress. Unfortunately
most of the diseases that Rust _cures_ are not much of a trouble in embedded.
My last UB, memory leak or loose pointer happened years ago. It will happen
again and when it happens, I'll debug it. That's it. I'm not afraid of UB or
dealing with pointers, even if there are 10K Rust users trying to FUD me. I
know what I'm doing. Every embedded/kernel/driver developer knows what they are
doing. When shit happens, that's it, no big deal. You plug your debugger and
solve the issue. It's not a nightmare that chases us in the middle of the day.

FUD alone is not enough for switching. So, why should I start thinking that a
variable cannot change, because it's a constant (wasn't it a variable?),
unless it can mutate, so it's a constant variable that can change because now
it's mutable?

Or constrain myself into borrow-checker torture for a thing I can do in a
couple of instructions?

 _WHY?_

2) This is not about a _language problem_; it's about solving a _programmer
problem with language_. If C = math, then you cannot do math _simpler/better_
because today's mathematicians are sloppier. Or because bosses pressure
people to deliver crappy products.

I'm not a genius. I'm far from it, and if any seasoned C programmer challenges
me I'll probably run away. But embedded/kernel/driver development is harsh, so
if a developer thinks that he/she cannot make it because of the _language_, then
it's mostly about searching for an excuse. Time to change jobs.

The key is to think that a lot of people did (and do) a lot with so much
less, for 40 years now. It's not a language problem. People have to learn to
deal with it.

I was there too 20 years ago, when every C++ developer was afraid that they
would lose their jobs because of C#. It never happened. C++ is still one of the
most used languages.

3) In my case, there are official libraries from manufacturers you have to
use. Sometimes receiving customer support depends on if and how you use those
libraries. All those libraries are in C. All the support is in C. All the
examples are in C.

Yes, I know there is that engineer that has a Github repo with a library that
works fine with that STM32 for that specific language, that now is getting
support for embedded so in 5 years we could _maybe_ put something in
production. But, not for now.

Sorry for the length. Edited some typos.

~~~
dx87
> Every embedded/kernel/driver developer knows what they are doing.

Yet we still see security issues in all of those. I'm not saying that
everything should be re-written in a different language, but C developers
saying "I'm a good developer, all those safety mechanisms would hold me back"
doesn't hold water considering all the security vulnerabilities we see that
would have been prevented if they had used a language with better safeguards.

~~~
scoutt
> Yet we still see security issues in all of those

That's the typical excuse. That's the FUD I mentioned. Bugs will keep
existing, and so will security issues, no matter the language you use.

~~~
hurrrrrrrr
But if we get rid of a whole class of bugs and vulnerabilities, the number of
bugs will go down, no?

~~~
scoutt
This is a common way of thinking about it, but nobody really knows. It's
purely a conjecture. There is no data to back it up, and it just appeals to
common sense.

Should everybody drop C/C++/whatever and rush into the Rust train because Rust
people have a _conjecture_?

I ask the opposite question. What would happen if the only programming
language left is C? Wouldn't we become better programmers and raise the bar so
high that the bug count drops to 0?

~~~
whatshisface
> _Should everybody drop C/C++/whatever and rush into the Rust train because
> Rust people have a conjecture?_

The way these things usually work in practice, the evidence that a new
paradigm improves things usually builds up slowly. There will never be a point
at which someone proves mathematically that C is obsolete. Instead, the gentle
advantages of other options will get stronger and stronger, and the
effectiveness of C programmers will slowly erode compared to their
competition. At first only the people who are really interested in technology
will switch, but eventually only the curmudgeons will be left, clinging to an
ineffective technology and using bad justifications to convince themselves
they aren't handicapped. Is Rust the thing that everyone except the
curmudgeons will eventually switch to? Who knows, but if you don't want to end
up behind the industry then it might pay to try it out in production to see
for yourself. If you don't make room for research and its attendant risks you
will inevitably fall behind.

~~~
scoutt
I'm sorry you see things in terms of _competition_ and _curmudgeons_. But I
see your point, it's just another type of FUD: don't stay behind and adopt
Rust because you'll be a curmudgeon.

~~~
whatshisface
Not exactly, I'm saying that if you stay behind and don't adopt _something_,
where that something is whatever the industry switches to after C, you will
eventually be left behind. Of course it is also possible (and likely for many
people) to die or retire before that happens. It's not like C is going away
any time soon.

------
pjc50
This is a great achievement in making formal methods accessible in pragmatic
terms - something that actually works and can be used by normal humans.

Would be nice to have an actual microcontroller example.

> The standard library is fully stack based and heap allocation is strongly
> discouraged

:/ - I can see why this is done, as it's hard, so banning it to make the
problem tractable works. But it's also quite inconvenient. On the other hand,
"MISRA C:2004, 20.4 - Dynamic heap memory allocation shall not be used."

~~~
pjmlp
Unless it has changed, SPARK also forbids dynamic allocation.

~~~
yannickmoy
It has changed: [https://blog.adacore.com/pointer-based-data-structures-in-
sp...](https://blog.adacore.com/pointer-based-data-structures-in-spark)

------
loeg
Pre- and post-condition annotations (`where` and `model`, respectively) are
great. It's nice to see those in a new language.

Syntactically, this seems more like a Rust dialect than a C dialect. The
primary relation to C seems to be portability (transpiler) and integration
(ABI). This is true of most C-transpiled languages, though.

It's certainly cute, and potentially useful if your program is small enough to
be solved by a SAT solver. Ideally relatively quickly, or those compile times
will be poor. I wonder how it deals with machine registers, which are often
something you would be using in embedded C.

------
pjmlp
I thought this would be more like Frama-C. Compared to that, it doesn't look
like C to me.

So not sure about possible adoption among C devs, even when the idea looks
quite good.

~~~
akavel
Right, I wonder why they didn't build it with a C-like syntax?

------
socialdemocrat
Very cool idea! What I love about C is that all sorts of programming languages
can talk to its ABI and that it fits well with low level programming.

However it is a cumbersome and unsafe language. This seems like a very nice
solution. You can write in a much safer language while producing C code which
does not look too alien relative to what you wrote.

I have been interested in Rust but think it looks a tad too complicated.
This may be a happy in-between.

~~~
simias
I'd expect that a formally provable language would be _more_ complicated to
write in practice than Rust. Take this example for instance:

>you must tell the compiler that accessing the array at position 2 is defined.
quick fix for this one:

    
    
        fn bla(int * a)
            where len(a) == 3
        {
            a[2];
        }
    

In Rust you don't need the where clause, the `a[2]` operation will just panic
at runtime if the array is too short. You don't have to prove to the language
that the access is correct.
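A minimal Rust sketch of that difference, with hypothetical function names. Plain indexing compiles for a slice of any length and panics at runtime when the slice is too short, `get` returns an `Option` instead, and a fixed-size array type carries the length in the type, which is loosely similar in spirit to `where len(a) == 3` but is not the same mechanism as ZZ's solver-checked clause.

```rust
/// Compiles for any slice; panics at runtime if it has fewer than 3 elements.
fn third_or_panic(a: &[i32]) -> i32 {
    a[2]
}

/// `get` moves the bounds check into the return type instead of panicking.
fn third_checked(a: &[i32]) -> Option<i32> {
    a.get(2).copied()
}

/// The array length is part of the type, so index 2 is known to be in bounds.
fn third_of_three(a: &[i32; 3]) -> i32 {
    a[2]
}
```

Calling `third_or_panic(&[1, 2])` compiles and then panics with an index-out-of-bounds message, while `third_of_three(&[1, 2])` is rejected at compile time because the argument has the wrong type.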

~~~
firebacon
> operation will just panic at runtime if the array is too short

Exactly, Rust doesn't provide a good solution to this problem at all. Panics
in Rust are an escape hatch used to ensure the language stays "safe" in
situations where the compiler cannot prove a given behaviour at compile time,
but where it would have made the language too ugly if you had to wire through
Result types for all the trivial operations like adding two numbers.

In my experience, panics in Rust have been a major source of pain. In contrast
to an exception, which you can catch, a panic behaves more like an abort. At
least it has been that way in the past. Now, with a lot of libraries using
panics to signal runtime errors, coding in Rust has at times felt like
I was using a bunch of badly written C libraries that internally call
"abort()" and kill the process when something goes wrong that would have been
totally handle-able without killing the whole process. That's the benefit of
using a safe language, right?

I think lately the Rust "community" has become aware of this issue and IMO the
way things are going is that panics, as they are designed, should
basically not be used. But, without proper exceptions, that brings you back to
the situation where an operation as trivial as adding two numbers either
produces a return code that must be explicitly checked or may silently fail
and produce an "undefined" result in some cases.
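For reference, a short sketch of the options Rust's standard integer types offer for addition; the wrapper names are hypothetical, while the `checked_add`, `wrapping_add` and `saturating_add` methods are part of the standard library. With default build settings, plain `a + b` panics on overflow in debug builds and wraps in release builds, so the wrapped value is well defined (two's complement) even if it is not what the caller wanted.

```rust
/// Returns None on overflow, forcing the caller to handle that case.
fn add_checked(a: i32, b: i32) -> Option<i32> {
    a.checked_add(b)
}

/// Wraps around on overflow in every build profile.
fn add_wrapping(a: i32, b: i32) -> i32 {
    a.wrapping_add(b)
}

/// Clamps to i32::MIN or i32::MAX on overflow.
fn add_saturating(a: i32, b: i32) -> i32 {
    a.saturating_add(b)
}
```

The trade-off described above is visible here: only `add_checked` makes the failure explicit, and it does so at the cost of a return value that must be inspected.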

~~~
jcranmer
A panic in Rust nicely captures the category of errors which the programmer
asserts should be impossible, which generally cannot be recovered well from. A
panic is kind of catchable, but it only works in the sense of killing only
part of the process rather than the entire process (like abort does).

If errors are expected, then you should use Result instead of a panic. If people
are using panics instead of Result for these kinds of errors, then the library
is wrong. I'm curious what examples you have where this is happening.
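A small sketch of that split, using hypothetical helpers: an expected failure (bad input) goes through `Result`, a broken caller promise panics, and `std::panic::catch_unwind` contains the panic at a boundary. This assumes the default unwinding panic strategy; with `panic = "abort"` nothing can be caught.

```rust
use std::num::ParseIntError;
use std::panic;

/// Expected failure: malformed input is reported through `Result`.
fn parse_port(s: &str) -> Result<u16, ParseIntError> {
    s.trim().parse::<u16>()
}

/// Programmer error: the caller promised `n` is in range, so this panics.
fn nth_digit(digits: &[u8], n: usize) -> u8 {
    assert!(n < digits.len(), "caller promised n < digits.len()");
    digits[n]
}

/// Contains a panic at a boundary and turns it into an ordinary error value.
fn run_isolated(n: usize) -> Result<u8, String> {
    panic::catch_unwind(|| nth_digit(&[1, 2, 3], n))
        .map_err(|_| String::from("worker panicked"))
}
```

`catch_unwind` is meant for isolation boundaries such as threads or FFI, not as a general substitute for exception handling.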

------
Jaxan
I like how they incorporate an SMT solver. They claim: “all code is proven”.
What does that mean? What is proven about the code? Absence of memory bugs, or
actual correctness of algorithms?

~~~
frabert
That the code is provably free of C undefined behavior

~~~
MaxBarraclough
Looking at the example code, [0] it's clear you're correct.

As I just rambled about in another comment in this thread, their project
summary isn't clear about this. Still a great idea for a language though.

[0]
[https://github.com/aep/zz/tree/master/examples/hello/src](https://github.com/aep/zz/tree/master/examples/hello/src)

------
iainmerrick
One question that isn’t obvious from the overview:

This language compiles to C and asserts that your program will never exhibit
undefined behavior; have they _proved that ZZ is correct_, i.e. that it will
definitely never output C code that exhibits undefined behavior?

If you really care about correctness, that seems important. I absolutely love
the idea in general though.

~~~
pickdenis
This seems similar to bootstrapping theorem provers. Here is a HN discussion
on one:
[https://news.ycombinator.com/item?id=21358674](https://news.ycombinator.com/item?id=21358674)

------
simias
So it's more like a C transpiler than a dialect of C IMO (it effectively looks
more Rust-like than C-like). I'm not just saying that for nitpicking the
obviously vague definition of "dialect" but rather because it's a very
important feature IMO, and actually makes this project potentially more useful
to me.

I'd be very wary of switching my toolchain to an experimental one in
production, especially if I'm targeting some niche DSP. On the other hand
generating C and then compiling it as usual seems less of a hurdle to me. They
actually point that out in the intro but I thought it might be worth
mentioning it here.

Now I just quickly skimmed the readme but do they explain how they deal with
interfacing with standard, non-ZZ C code? I assume they need some sort of
"FFI" bindings like Rust to make the code safe.

~~~
rezeroed
_It will always emit C into a C compiler which will then emit the binary._

Presumably we can dump the C before compilation?

~~~
iainmerrick
Why?

For this purpose, C is a very convenient and very portable assembly language.
You could use a verified C compiler like CompCert to convert it to machine
code.

In an ideal world, there’d be no C compiler in the loop at all, sure. But in
practice it’s not causing any trouble at all in a ZZ -> C -> machine code
workflow. Quite the reverse, targeting C has some major benefits. (There are C
compilers that are very fast, very highly optimizing, very portable, and/or
verifiably correct.)

 _Edit to add:_ just realised I might have totally misunderstood your comment,
sorry! Apologies for jumping the gun if so.

If you just mean can we save the generated C to disk instead of compiling it,
yes, I would hope so too.

~~~
a3p
Absolutely! The zz export command just dumps the C and SMT code along with
makefiles for common build systems and stops there. Very handy for using it
within other toolchains.

------
VladimirGolovin
This is a very interesting idea. The world needs more languages that make
formal provability practical.

And I think I found a typo: thery is_open(int*) -> bool; -- should be
"theory".

------
_sbrk
> Its main use case is code close to hardware, where we still program C out of
> desperation, because nothing else actually works.

I think I'll stick with Ritchie's language over this fly-by-night invention.

------
je42
> Checking is done by executing your code in SSA form at compile time within a
> virtual machine in an SMT prover. None of the checks are emitted into
> runtime code.

Cool !

------
dooglius
VCC [0] was a project by Microsoft Research that was used to successfully
prove Hyper-V correct, but unfortunately it seems to have been abandoned.

[0] [https://github.com/microsoft/vcc](https://github.com/microsoft/vcc)

------
naasking
No mention of unions, discriminated or otherwise. Sums/unions are a pretty
important abstraction, and a lot of C's unsafety can sneak in if you don't
enforce discriminated unions.

------
baybal2
This is gold, if it will actually work.

I will cross my fingers, and try it.

------
rurban
Way cool. But in my testing yices blows away z3 by roughly a factor of 10. (see
solvecomp.sh in one of the example ssa dirs)

------
camgunz
Love this project. Super pragmatic (use an existing solver) and small scope
(no template functions).

------
asimpletune
I love this idea. Does anyone know what the compile-time performance is like?

------
waffle_ss
Is the name supposed to be a French dick reference like Coq?

[https://en.wiktionary.org/wiki/zizi#Noun](https://en.wiktionary.org/wiki/zizi#Noun)

------
mr_vile
Okay, it's provable... what value do we really get out of it? Isn't the whole
reason why we still use C because it gets so close to resolving (or at least
acknowledging) hardware or platform-level implementation problems? We can
continue to re-invent the language in isolation but you can't replace a
hardware-level programming language in this manner.

~~~
iainmerrick
It is surprising that the syntax is so different for a “dialect” of C, but
apart from that --

If the semantics are very close to C, so you can do basically all the same
stuff, but you _also_ get guarantees that your code will absolutely never hit
any undefined behavior, that’s _great_ and tremendously useful. It absolutely
could replace C in applications where C is still the most useful and practical
language.

It would resolve a couple of major headaches in existing C code: security bugs
caused by memory overflows (caused by using arrays or pointers in undefined
ways); and highly optimizing compilers doing weird things to your code, by
exploiting undefined edge cases.

If I know my code will _definitely not_ hit any undefined behavior, that gives
me a ton more confidence that it won’t have stupid buffer overflow bugs and I
won’t get mysterious errors on certain platforms.

------
DmitryOlshansky
It can’t be a dialect of a language if the compiler of said language doesn’t
compile it.

~~~
amitport
[https://en.wikipedia.org/wiki/Programming_language#Dialects,...](https://en.wikipedia.org/wiki/Programming_language#Dialects,_flavors_and_implementations)

Plenty of well-known dialects are not subsets of said language (/compiled by
its compiler).

------
marta_morena
This is just another symptom of why OpenSource often sucks. Instead of somehow
coordinating and focusing their efforts, everyone seems to need to start their
own spin off "inspired" by other projects.

When will people realize that building a new language is almost always going to
fail and only very, very few languages ever reach anything close to adoption?

Instead of spending all this time writing your own doomed language, why not
try to contribute to a project like LLVM or Rust and add your provable subset
there?

The same goes for Linux, which is the paradigm of wasted efforts.

~~~
AnimalMuppet
Bjarne Stroustrup never expected C++ to be very widely used. He made it
anyway.

Alex Stepanov never thought that anyone would care about his ideas on generic
programming. He pursued them anyway. They became the STL.

True, building a new language is almost always going to fail. The problem is,
when someone starts working on a new language, _they don't know if it's
doomed or not_. It is good that 1000 people try, because from that we get one
language that many people use, and 10 specialized languages that a few people
use, and 10 languages that nobody uses but future people steal some of the
ideas.

> The same goes for Linux, which is the paradigm of wasted efforts.

Um... what? Wasted because nobody uses it? Very much no. Wasted because it's a
duplication of what was there before? To some degree, yes. But not everything
in Linux was in Unix before it. And Unix couldn't run all the places that
Linux does (smartphones to mainframes). So, no, Linux is not wasted effort.

