
Z3 SMT solver interactive tutorial - vmorgulis
http://rise4fun.com/z3/tutorial/guide
======
ahelwer
If you're looking for a fun, substantial introductory Z3 project, try writing
a program to compare two sets of firewall rules for functional equivalence.
Stated another way: does there exist a packet which is accepted by one
firewall but not the other?

I was lucky enough to be tasked with solving this problem in Azure! We were
changing the method by which node firewall rules were automatically generated,
and wanted a way to verify equivalence between the old and new versions for
each node. The brute force approach was clearly out: we'd have had to check
2^72 IPv4 packets (32-bit source address field, 16-bit source port field,
16-bit destination port field, 8-bit protocol field). I bummed around trying
to figure out a way to transform the firewall rules into some kind of minimal
normalized form for direct comparison, but the Z3 approach turned out to be
shockingly simple. When Z3 is really well-suited for a problem, it feels like
straight up magic.

If you want to spoil the implementation details for yourself, it goes like
this:

1) Set up bit vector variables of the proper size for each field in the IPv4
packet header.

2) Encode each firewall rule as a restriction on the variables from (1), then
conjoin them to form a logical representation of a firewall.

3) Task Z3 with finding values for the variables in (1) for which one firewall
is true, and the other is false.

Here's a simple example which you can run directly in the linked demo page,
using two integer fields (source and destination) for simplicity:

    
    
      (declare-const src Int)
      (declare-const dst Int)
      (define-fun matches ((srcLower Int) (srcUpper Int) (dstLower Int) (dstUpper Int)) Bool
        (and (<= srcLower src) (<= src srcUpper) (<= dstLower dst) (<= dst dstUpper))
      )
    
      (define-fun firewall1 () Bool
        (and
          (matches 0 10 20 30)
          (not (matches 5 10 25 30))
        )
      )
    
      (define-fun firewall2 () Bool
        (and
          (matches 1 10 20 30)
          (not (matches 5 10 25 30))
        )
      )
    
      (assert (not (= firewall1 firewall2)))
      (check-sat)
      (get-model)
    

This finds the model [0, 20] - a "packet" which is accepted by one firewall
but not the other.

In practice this solution ran very quickly - less than a second for firewalls
with hundreds of rules. It also gave a counterexample if the firewalls were
not equivalent, which could be matched with offending rules and was very
useful for debugging.

~~~
csl
This was running user-facing, in production? Do you have any experience with
scaling Z3? I'm thinking large problems running on many machines, contra many
small problems on many processes.

For some reason, I always imagined that SMT was used to verify stuff _before_
putting it into production, but this opens up a whole new world of ideas for
me.

~~~
ahelwer
It wasn't running as any sort of persistent service, if that's what you mean.
It just ran during deployments to check output of old & new firewall
generation methods for equivalence.

Something like what you describe was implemented in an Azure system called
SecGuru, which is sort of a generalized version of firewall comparison to
check security rule enforcement. Here is the whitepaper:
[https://www.microsoft.com/en-us/research/wp-
content/uploads/...](https://www.microsoft.com/en-us/research/wp-
content/uploads/2016/02/secguru.pdf)

------
tikhonj
If you want a fun example of using Z3, I gave a talk recently that walks you
through verifying programs in a tiny little imperative language by compiling
them to Z3 formulas. It's easier than it sounds!

It was recorded [1] and I have example code and additional resources (papers,
etc) on my site [2].

[1]: [https://youtube.com/watch?v=ruNFcH-
KibY](https://youtube.com/watch?v=ruNFcH-KibY)

[2]: [http://jelv.is/talks/compose-2016/](http://jelv.is/talks/compose-2016/)

If you want to do more, you can extend the same process to _synthesizing_
programs, not just verifying them. It doesn't scale too far, but otherwise
it's like magic.

~~~
pron
Very nice! Check this out:
[https://youtu.be/G2kx1aFpT6M?t=28m](https://youtu.be/G2kx1aFpT6M?t=28m)

BTW, TLA+ also compiles to Z3 (and other solvers: CVC3, Yices, Zenon).

------
zielmicha
Z3py (Z3 bindings for Python) is a much better interface for Z3 IMO.

[http://www.cs.tau.ac.il/~msagiv/courses/asv/z3py/guide-
examp...](http://www.cs.tau.ac.il/~msagiv/courses/asv/z3py/guide-examples.htm)

~~~
eggy
I like the Lisp/Scheme syntax better, but that is because I have studied Lisps
far longer than Python, and I just find them easier for phrasing logic.

The Python looks clean and familiar too.

F* the theorem-proving, program-verification language from MS is ML-like, and
exports to OCaml and F#.

Can the Z3 SMT solver be used for F*'s SMT needs? I haven't found a link yet.

~~~
ergl
Yes, afaik F* uses Z3 internally.

I don't have any links, but I know a guy working on formalizing the
translation from F* to Z3

~~~
eggy
I guess since F* is similar to F# or ML you could have a pretty cohesive
solver to production toolchain. Having OCaml as an output option is great too.

I was looking at Idris with the C backend as a way of formalizing my
programming for production programming. This makes another set to evaluate
before deciding. I realize they are quite different, but I am just
experimenting at this moment.

------
mettamage
If Z3 is on here, I just want to mention Triton (
[http://triton.quarkslab.com/](http://triton.quarkslab.com/) ) which is a
master thesis project from a French (?) person and Pin from Intel (
[https://software.intel.com/en-us/articles/pin-a-dynamic-
bina...](https://software.intel.com/en-us/articles/pin-a-dynamic-binary-
instrumentation-tool) ). With Triton and Pin you can instrument binary
software and gather SMT formulas. It basically allows you to know about the
binary SMT style.

I had to use it in a practical lab to figure out the input of some hashing
function (that would almost always hash to 0). By knowing all the constraints,
solving it to know what the input should be becomes quite easy (at least easy
compared to brute force).

