KLEE Symbolic Execution Engine

MaxBarraclough · 2024-08-27T20:58:04 1724792284

The project's webpage is a more helpful starting point than the repository: https://klee-se.org/

vidalee · 2024-08-28T10:10:30 1724839830

I discovered the Klee engine when reading The Ph.D. Grind by Philip Guo[1], it gives an idea of how its development was going at the time and this is also why I have difficulty appreciating the software, an interesting read if you want to get a feel of the academic world.

[1] https://archive.is/KAz4n

xanzi2048 · 2024-08-28T04:57:40 1724821060

Nowhere in the repo or the linked website does this explain what it does or why it's useful. What is this for?

loeg · 2024-08-28T05:04:16 1724821456

Symbolic Execution is a term of art for a kind of program analysis using symbolic value inputs. https://en.wikipedia.org/wiki/Symbolic_execution

e1gen-v · 2024-08-29T00:38:08 1724891888

I used this in my software testing class. It was really cool! Although interfacing with LLVM was not very fun, I feel like I could’ve spent a lot more time learning the innards of this tool instead of fighting with llvm and C++.

anonymousDan · 2024-08-27T21:06:44 1724792804

Is anyone actually using symbolic execution in industry these days? I personally like it but I feel like fuzzing has completely taken over in terms of academic research at least.

nanolith · 2024-08-28T02:16:11 1724811371

It's certainly still being used. I use bounded model checking, which performs a kind of symbolic execution using an SMT solver.

For more complex things that aren't as easy to model check, such as the formal verification of a deeply recursive algorithm or a cryptographic algorithm against specification, I'll use Coq or Lean to define an abstract machine model, then either extract code from this model into C, C++, or machine code, or when something hand written is required, extract code into an intermediate format, then import hand written source code into this same format, then prove the two equivalent.

Fuzzing is useful, but fuzzing serves different goals. With fuzzing, you can find execution paths that are incorrect, with a reproducer. However, these paths must be reachable with the fuzzer within the search depth defined. Fuzzing can find failure cases, but it can't prove the absence of such cases. Model checking can be significantly faster if well engineered, and it can be used to prove the absence of classes of errors. The same goes with constructive proofs in a proof assistant. The two approaches are complementary. One doesn't replace the other.

skulk · 2024-08-28T14:25:42 1724855142

> I'll use Coq or Lean to define an abstract machine model, then either extract code from this model into C, C++, or machine code,

Do you have any worked examples or blog posts that detail this process?

nanolith · 2024-08-28T16:51:34 1724863894

Not yet, but very soon. My previous work was done for employers and involves code that is not free to distribute.

I am currently working on some blog articles to cover a web application written in model checked C. Once that is done, and once my personal C parser library is complete, I'll write a series of articles on constructive and equivalence proofs of software written in C.

A bucket list item for me is to write a series of books on this, focusing less on the academic aspects and more on the practical aspects. There are plenty of excellent text books on this, but few "how to I write real world code that is model checked / constructively proven?" Unfortunately, this has led to a lot of odd opinions being passed around as fact, like that it takes 30 man years to write constructively proven code, or that model checking is impractical.

nekitamo · 2024-08-27T21:59:27 1724795967

I’ve been using Triton a lot lately to make sense of heavily obfuscated code. The ability to translate a symbolic “trace” into an LLVM bitcode listing is super useful. You can do a lot of optimizations from there easily to recover the original code:

https://github.com/JonathanSalwan/Triton/wiki

__alexander · 2024-08-27T22:18:14 1724797094

Your deobfuscating process would make a good blog post :)

rurban · 2024-08-28T05:10:48 1724821848

Yes, we do. Fuzzing takes forever and only finds errors by luck, whilst with cbmc you'll find all errors within short loops (=data).

Klee is dead for years.

Both needs extra setup, but my make verify targets are still much easier to setup and run, than make fuzz.

angry_octet · 2024-08-28T10:58:00 1724842680

That's funny, I recently went to a talk by Cristian Cadar about developments in KLEE and it seems very much alive. Perhaps you are talking through your hat.

rurban · 2024-08-28T12:34:02 1724848442

good to hear. I thought it's dead (llvm 4.7 only)

anonymousDan · 2024-08-28T08:26:35 1724833595

CBMC is a model checker though right, not a symbolic execution engine? Or do you feel for all intents and purposes there is no difference?

rurban · 2024-08-28T12:34:49 1724848489

It's the very same symbolic execution engine as klee, which just a bit more features, and a bit simpler to use.

drdrey · 2024-08-27T21:48:23 1724795303

does Kani count?

https://github.com/model-checking/kani

PartiallyTyped · 2024-08-28T01:07:12 1724807232

It should! Kani’s used by AWS’ Firecracker hypervisor.

topazas · 2024-08-27T21:35:15 1724794515

I would like to - but unable to find any libraries / engines that would function in a some sort of 'mocked' environment. for example - i need to cover all execution paths for only single function with high cyclomatic complexity, given all other stuff is mocked. Maybe it would be called unit test mix with se. also for dynamic languages such as php/python/ruby.