Hacker News new | past | comments | ask | show | jobs | submit login
Kissat SAT Solver (fmv.jku.at)
111 points by veselin 15 days ago | hide | past | favorite | 60 comments



If anyone else is wondering how the hell to use this thing... the main API is here:

    https://github.com/arminbiere/kissat/blob/master/src/kissat.h
In particular all you need to add clauses is this single API entry point:

    void kissat_add (kissat * solver, int lit);
The solver consists of the conjunction of disjunctions (I hope I have that correct — maybe the other way around? I can never remember). Variables are assumed to be any non-zero integer:

    kissat_add(s, [N != 0]);
You create a clause by repeatedly calling kissat_add; the clause is terminated by a 0. For instance if {shirt=1, tie=2}, then "!shirt or tie" is:

    kissat_add(s, -1); kissat_add(s, 2); kissat_add(s, 0);
Once the solver is run, if there's a solution, you get the solution using `kissat_val`:

    kissat_val(s, 1);
    kissat_val(s, 2);
I hope.


It supports the DIMACS input format [1], which you can parse using the kissat_parse_dimacs function [2].

DIMACS is a pretty terrible format, but at least you don't have to build the clauses up yourself using kissat_add.

[1] http://www.cs.utexas.edu/users/moore/acl2/manuals/current/ma...

[2] https://github.com/arminbiere/kissat/blob/master/src/parse.h


SAT solvers use CNF (conjuction of disjunctions).

DNF-SAT can be trivially solved in O(n), because each clause directly encodes a solution (except for clauses that are contradictory, of course). But there's no known way to "convert" arbitrary formulas to DNF in polynomial time, so such a solver is not useful.


How do you interrupt the computation, e.g. when it takes too long?


You could always fork the process and send SIGKILL to the child after a timeout.


No need to fork; you could install timer using alarm(), setitimer(), or timer_create() and friends.


But how would you clean up the heap and restart the library?


I think you can call kissat_terminate() from the signal handler.


To save the unwashed masses (myself included) any further internet searches, this relates to the Boolean satisfiability problem, which according to Wikipiedia [0] is:

"the problem of determining if there exists an interpretation that satisfies a given Boolean formula. In other words, it asks whether the variables of a given Boolean formula can be consistently replaced by the values TRUE or FALSE in such a way that the formula evaluates to TRUE. If this is the case, the formula is called satisfiable."

I originally thought it was an AI to solve college placement tests, which would also have been cool but in a different way.

[0]: https://en.wikipedia.org/wiki/Boolean_satisfiability_problem


SAT solvers are really powerful machines. You can fit many problems in a Boolean formula and then ask the solver to work it out. Now, the solver is not always optimized for your task but sometimes it helps. Also, the exercise of turning a problem into a (usually huge) Boolean formula is interesting. For example, I solved the "find a longest path in a graph" with a SAT solver (TBH, I quickly replaced that with a specialized solver, but, well, that was fun !)


SAT is the prototypical NP-complete problem.


And indeed I think the reduction of NP problems to SAT is pretty intuitively understandable, at least at a hand-wavy level. If a decision problem is in NP, then given some candidate solution to the problem we can verify it in polynomial time. If you imagine a computer implementing this decision algorithm, and you squint your eyes, you can imagine it as a huge boolean satisfiability problem. The program takes some input (as bits aka boolean variables) and the program is run in the circuitry of the computer which at the bottom is effectively doing boolean algebra with logic gates and such. And so if you can solve SAT in polynomial time, then you can run it on this instance of SAT (the instantiation of the verifier for this NP problem) and find out if there's any input to that NP problem that satisfies it. Obviously there's a lot more work to make it formal.


While it is true, if you follow current research, most people in the field seem to be on a great hunt for hard SAT instances where no known heuristic really helps. As of yet this hunt has not been successful.


A hard SAT problem:

Let nonce be 32 boolean inputs, representing numbers from 0..2^32-1.

Let other_bits and difficulty be a collection of constant boolean inputs, in the following expression.

Let f(other_bits, nonce, diff) be the boolean logic circuit used in Bitcoin mining, for a recent candidate block.

Roughly speaking, f combines nonce with other_bits to make a bit string, performs SHA-256 on the bit string, then SHA-256 on that, and then checks if the leading diff number of bits are zero.

Folding in the constants, f reduces to a boolean logic circuit with 32 inputs and 1 output, g(nonce) = result.

The SAT problem: Does there exist a value of nonce for which g(nonce) = 1?

If you can solve this quickly, you just sped up Bitcoin mining by a factor 2^32 iterations, which is a lot. You can take this further as well.

I don't know of any SAT solver heuristic which really helps with this.


This SAT instance is huge, which makes it hard. But you haven't shown that it is a harder SAT instance than a random one of its size.

It's all about the difficulty vs the size of the instance.


To the best of my knowledge, the example has no known heuristics for finding a solution that are significantly faster than brute force.

That puts it in the hardest class of instances for its size (subject to available knowledge), because all instances can be solved by brute force.


A SAT solver might very well have optimizations that make it end up running in 2^200 on average instead of what brute force would give.

The problem is, since the instance is so large, we don't know what happens. No one has ever ran one to completion.

So no, it's not in the hardest class of instances for its size. Its difficulty is unknown.


You can encode as moderately large (low hundreds of clauses/variables) SAT instances questions of Ramsey-theory and other unsolved combinatorics and those instances will not be solved by any heuristic.


> As of yet this hunt has not been successful.

How current is your information ?

I think I've seen random instances in some of the more recent SAT competitions with only a few thousand variables that have not been solved. That said in theory there should be 100 variable instances that are too hard to solve but I haven't seen any really hard instances that small.


I was in Lissabon Last year at the SAT conference and that was my main takeway, though my field of research is only adjacent.

One paper was about creating random sat instances where the probability of unsat ist 50/50 and as it turned out: none of those instances was hard.

You can always craft hard instances to defeat a specific heuristic. But crafting one that is hard for all of them is the problem.


Thank you! I clicked all the links on the page and still had no idea what SAT or the SAT Competition was!


This is very exciting! It not only has much improved performance, but it's also clean code, easily extensible for embedding in other applications (I do such a thing myself). Sadly I can't use it yet due to lack of incremental solving.


For obscure reasons I know that this is the Finnish word for 'cats', and I wondered if there was any connection there. Apparently the answer is yes

https://github.com/arminbiere/kissat

or at least the developers have already noticed this coincidence for themselves.


Are there any software development problems we should be solving with SAT solvers instead of using heuristics?


Dart's package manager (Pub) uses an SAT solver to find possible solutions, and even more impressively, to describe why a dependency can't be fulfilled:

    Because dropdown >=2.0.0 depends on icons >=2.0.0 and root depends
  on icons <2.0.0, dropdown >=2.0.0 is forbidden.

    And because menu >=1.1.0 depends on dropdown >=2.0.0, menu >=1.1.0
  is forbidden.

    And because menu <1.1.0 depends on dropdown >=1.0.0 <2.0.0 which
  depends on intl <4.0.0, every version of menu requires intl
  <4.0.0.

    So, because root depends on both menu >=1.0.0 and intl >=5.0.0,
  version solving failed.
https://medium.com/@nex3/pubgrub-2fb6470504f


PHP's dependency manager, Composer, uses this approach too.


Any problem that works.

IIRC: SAT Solvers regularly solve 30,000+ nodes of traveling salesman problem in just seconds (Or something along that magnitude. I forget exactly...). Yeah, its NP complete, but that doesn't mean its worthless.

It will take some experimentation to figure out if your problem is within the "feasible" category of SAT solving. I don't think there's any method aside from "try it, and if it works, be happy. If not, be sad".

If we knew which problems were easily solved with SAT-solvers, we probably would be well on our way to proving N == NP ?? So just guess and check.

--------------

Honestly, people may forget that brute-force solvers can be incredibly useful. When I was playing with a hash-algorithm I was inventing (just for fun on the side...) I searched the entire 32-bit space for the best integer that best mixed-up the bits from input to output. The brute-force solver took only seconds to try every 32-bit number.

We can brute-force a surprising number of "intractable" problems from our college years. Yeah, its inefficient, but who cares if its solved in seconds or minutes?

There's also that a^3 + b^3 == c^3 problem that was solved in a few weeks by brute forcing every number (or something like that).


> There's also that a^3 + b^3 == c^3 problem that was solved in a few weeks by brute forcing every number (or something like that).

Huh?


Yeah no idea, where they got that from. Fermat argued in the 1630s that this was impossible, Euler then proved it in the 18th century.

[Actually, Fermat claimed that a^n+b^n=c^n is impossible for n>2, but the proof for that wasn't discovered until the 1990s. It's Andrew Wiles's famous proof of Fermat's Last Theorem.]


https://en.wikipedia.org/wiki/Sums_of_three_cubes

I was wrong and confused things for a different problem.

a^3 + b^3 + c^3 == X, where X is some "hard" number, like 42.

They discovered the a, b, and c values for X == 42 recently, by simplifying the problem over 2 variables and then brute forcing for weeks.

----------

Anyway, I was trying to reference the sum of 3-cubes issue.



There are only so many numbers, we simply tried them all. It took a really large computer. /s


You have to revalidate your assumptions from time to time.

I proved to a vendor they had screwed up CSPRNG seeding by doing a scatterplot of the first result from a large number of seeds. It ran so fast I bumped the loop counter up to max_int and it still ran in a minute or two on a laptop from 10 years ago.

When you can fit everything in cache, a CPU can sure run a lot of arithmetic in no time at all.


Brute-force numeric techniques are more fun on GPUs by the way.

4096 SIMD cores (163840 "CUDA threads" I guess) on a Vega64 at 1.5GHz with single-clock add / subtract / multiply / AND / OR / XOR instructions. Brute force is surprisingly parallel, because there's no "shared information" between threads. So its super easy to program and parallelize.

I started searching 40+ bit numbers to see how far up I could go while staying under 1-minute runtimes.

-----------

If you can keep your program state in under 100-bytes, so that your entire program fits inside of the Register-space of GPUs... things go fast. Extremely fast.


So you could convert TSP to SAT, solve it in SAT and get result back as optimal path?

any link?


The "Toy" problem is Sudoku -> 3SAT -> Solution -> Convert back. Sudoku is actually NP complete and therefore is the smallest toy example recently.

N-Queens was popular before Sudoku become a thing, and you might find some basic tutorials on N-Queens -> SAT solvers. Also an easy problem that's fun to solve on your own.

------

TSP is... very complicated to reduce to 3SAT. Try it after you understand the other examples. I did a search online, and found a good set of slides here: https://www.cs.cmu.edu/~ckingsf/bioinfo-lectures/sat.pdf

The tl;dr: TSP -> Hamiltonian Cycle -> 3SAT -> Solver.

http://www.math.uwaterloo.ca/tsp/sweden/index.html

------

Because all NP-complete problems convert into each other, its possible to convert other NP Complete problems (ie: Sudoku) into a TSP problem, then use TSP-solvers to answer the problem.

3SAT just seems to be the one NP-complete problem that everyone's reducing to. There's an implicit assumption that its easiest to convert to 3SAT.


Such is the nature of NP Complete problems, like SAT. They are the most general versions of NP problems, and all NP problems can be phrased as any NP Complete problem.

I don't know if there's an established formalised terminology for this type of conversion, and I haven't found one from a quick web search. But look up something like "translate", "convert", "reduce", "encode", or the like in conjunction with "NP Complete" and you shall find what you are looking for.


> I don't know if there's an established formalised terminology for this type of conversion

Technically they're polynomial-time many-one reductions.

https://en.wikipedia.org/wiki/Many-one_reduction

But in practice I think people usually just call them "reductions".


Not strictly software development but in the general area: routing cables between devices in a rack

https://web.archive.org/web/20190415054049/https://yurichev....

Inspired by that, I tried to use Z3 to solve the "best arrangement of letters on a linear typewriter" problem but never got a good answer because the solution space is far too large (26! which is seven hundred ninety-eight trillion, seven hundred twenty billion, nine hundred fifty-two million, one hundred seventy-six thousand, seven hundred and forty-five years at 16000 iterations a second.)


The idea is that for many problem instances the heuristic solvers can find a solution far more quickly than by exhaustive search, but maybe that wasn't the case for your particular problem.


It depends on how you define "software development problem," but:

* SAT (more specifically, SMT) has been applied with some success to automated vulnerability discovery via symbolic execution. Underconstrained symex has cropped up more recently as a strategy for minimizing state space explosions.

* Others have mentioned register allocation, but SAT/SMT is broadly applicable to compilation problems: instruction selection and superoptimization[1] come to mind.

[1]: https://github.com/google/souper


here's a cool example from somebody at Microsoft:

https://medium.com/@ahelwer/checking-firewall-equivalence-wi...

it's not a pure SAT problem, but Z3 works (in a way I don't understand) by reducing constraint problems involving integers and even more complicated domains into SAT problems.


SMT solvers [0] like Z3 are very powerful things. They're used to power deductive software verification tools. We're discussing one such at [1].

[0] https://en.wikipedia.org/wiki/Satisfiability_modulo_theories

[1] https://news.ycombinator.com/item?id=23997128


Not quite a problem as per se but it would be nice to have some kind of SMT solver inside a compiler to verify well-behaved-ness of certain functions. I'm not exactly sure whether you'd need something a bit more advanced for control flow analysis but beyond simple variable assignment checking (imagine Ada style bounds) I would quite like a language with a verifier like eBPF uses to reduce bug surface area (i.e. Not all code needs to be Turing complete). I think there are langauges with features like that but none aimed at mainstream use beyond research and/or defence etc.

I would like to write a language like that but I don't know enough about computer logic / hard ai etc. to know where to go in terms of the internals of the compiler (Bolting complicated analysis onto an existing compiler is really really ugly if there is a lot of global state - e.g. some parts of the D compiler although that's still WIP to be fair to the programmers)


Stainless [0] does this well for Scala. You augment function definitions and return statements with `require()` and `assert()` respectivley.

[0] https://stainless.epfl.ch/


In recent years there has been a lot of improvement in refinement types which is, from what I understood, what you are describing. See for example Liquid Haskell that uses Z3 solver for static verification.

https://ucsd-progsys.github.io/liquidhaskell-blog/

I'd say that when dependent + refinement types get more mature and get pass the basic prototype idea in research languages/environments, these ideas could easily transfer to mainstream static typed languages and be a new module of future compilers.


If you have a combinatorial problem that your first inclination is to use a brute-force backtracking-based search then maybe you could you a SAT solver instead. It is a declarative way to solve the problem and good SAT solvers employ some very clever heuristics such as non-chronological backtracking and clause learning, which you might not get if you try to write the backtracking search yourself.

One example that I like is that some Linux package managers have switched to using a SAT solver to satisfy package version constraints.


Various sorts of scheduling problems come to mind... for example compilers can make good use of a fast sat solver; but typically you want a failover heuristic so you won't spend forever on large or unsolvable problems


Register allocation in compilers can be solved by graph colouring, which can be reduced to SAT and fed to a SAT solver.


I've been looking to get my feet wet with SAT and/or SMT solvers; can anyone recommend any short courses that I could start with to get a practical feel where where & how to use them? Python would be preferred.

Maybe this https://www.coursera.org/learn/automated-reasoning-sat would be a good start?

Ideally I would like to be able to take predicted outputs from ML plus business requirements to solve allocation problems; and more pie in the sky I would like to know enough to get inspiration for how to link a neural network with a sat solver, such as perhaps replacing or augmenting beam search in sequence decoding tasks.


Question: in college, I was lead to believe that a polynomial time SAT solver would be a huge breakthrough in many fields. Why is that the case? I know 3SAT is a reduction to NP problems, but besides that I feel like it wouldn't be particularly useful on its own.


If 3SAT is solvable in polynomial time, then P=NP. Which means that any problem that can be solved by brute-forcing an exponential number of possibilities (and doing a polynomial amount of checking for each) can also be solved with an asymptotically much smaller (polynomial) amount of work.

This includes anything from complicated optimization problems, to theorem proving, to inverting cryptographic hash functions, to any other NP algorithm that hasn't been thought up yet.

Whether this would have any practical use would depend on the polynomial exponents and constant factors, of course. But even in a theoretical sense it would be very, very surprising to a lot of people if it were true.


Donald Knuth apparently believes, in a very unusual view for a computer scientist, that P=NP but that the best algorithm has such bad exponents and factors that it won't have relevance for our computational practices -- if we're even able to discover it.

https://www.informit.com/articles/article.aspx?p=2213858&ran...


I think that's it. If you can reduce your NP problems to 3SAT and 3SAT is in P then all P=NP.


It'd be a breakthrough, but only for problems big enough that the cost (polynomial & constant) of translating to and from SAT would be worth it for the problem at hand.

Also P=NP would be a very big deal.


I don't think the cost of the reduction is usually the problem. Most reductions to 3-SAT I've seen are either linear or can be made linear with a little work. For example any boolean circuit can be reduced to a <=3-SAT instance with a number of variables proportional to the number of gates.

The problem is that current solvers are heuristic and may fail or take too long (ie. degrade to exponential time) on any particular instance.

If a polynomial-time algorithm is ever discovered for all 3-SAT instances (ie. P = NP) then of course its actual complexity would be the determining factor.


3SAT is a NP-complete problem which means if you can solve(not just check the solution) it in polynomial time, then you can solve all NP problems in polynomial time.


I wonder how big of an economic impact such improved algorithms have. Must be in the billions of dollars in the long run...


SAT Solvers are used in software that is used in creating the very complex Silicon Chips. Known as Formal in the semiconductor design industry.


Computational logic seems to be a more popular research subject in Europe than in the United States.




Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact

Search: