Hacker News new | past | comments | ask | show | jobs | submit login
KLEE Symbolic Execution Engine (github.com/klee)
88 points by nateb2022 74 days ago | hide | past | favorite | 19 comments



The project's webpage is a more helpful starting point than the repository: https://klee-se.org/


I discovered the Klee engine when reading The Ph.D. Grind by Philip Guo[1], it gives an idea of how its development was going at the time and this is also why I have difficulty appreciating the software, an interesting read if you want to get a feel of the academic world.

[1] https://archive.is/KAz4n


Nowhere in the repo or the linked website does this explain what it does or why it's useful. What is this for?


Symbolic Execution is a term of art for a kind of program analysis using symbolic value inputs. https://en.wikipedia.org/wiki/Symbolic_execution


I used this in my software testing class. It was really cool! Although interfacing with LLVM was not very fun, I feel like I could’ve spent a lot more time learning the innards of this tool instead of fighting with llvm and C++.


Is anyone actually using symbolic execution in industry these days? I personally like it but I feel like fuzzing has completely taken over in terms of academic research at least.


It's certainly still being used. I use bounded model checking, which performs a kind of symbolic execution using an SMT solver.

For more complex things that aren't as easy to model check, such as the formal verification of a deeply recursive algorithm or a cryptographic algorithm against specification, I'll use Coq or Lean to define an abstract machine model, then either extract code from this model into C, C++, or machine code, or when something hand written is required, extract code into an intermediate format, then import hand written source code into this same format, then prove the two equivalent.

Fuzzing is useful, but fuzzing serves different goals. With fuzzing, you can find execution paths that are incorrect, with a reproducer. However, these paths must be reachable with the fuzzer within the search depth defined. Fuzzing can find failure cases, but it can't prove the absence of such cases. Model checking can be significantly faster if well engineered, and it can be used to prove the absence of classes of errors. The same goes with constructive proofs in a proof assistant. The two approaches are complementary. One doesn't replace the other.


> I'll use Coq or Lean to define an abstract machine model, then either extract code from this model into C, C++, or machine code,

Do you have any worked examples or blog posts that detail this process?


Not yet, but very soon. My previous work was done for employers and involves code that is not free to distribute.

I am currently working on some blog articles to cover a web application written in model checked C. Once that is done, and once my personal C parser library is complete, I'll write a series of articles on constructive and equivalence proofs of software written in C.

A bucket list item for me is to write a series of books on this, focusing less on the academic aspects and more on the practical aspects. There are plenty of excellent text books on this, but few "how to I write real world code that is model checked / constructively proven?" Unfortunately, this has led to a lot of odd opinions being passed around as fact, like that it takes 30 man years to write constructively proven code, or that model checking is impractical.


I’ve been using Triton a lot lately to make sense of heavily obfuscated code. The ability to translate a symbolic “trace” into an LLVM bitcode listing is super useful. You can do a lot of optimizations from there easily to recover the original code:

https://github.com/JonathanSalwan/Triton/wiki


Your deobfuscating process would make a good blog post :)


Yes, we do. Fuzzing takes forever and only finds errors by luck, whilst with cbmc you'll find all errors within short loops (=data).

Klee is dead for years.

Both needs extra setup, but my make verify targets are still much easier to setup and run, than make fuzz.


That's funny, I recently went to a talk by Cristian Cadar about developments in KLEE and it seems very much alive. Perhaps you are talking through your hat.


good to hear. I thought it's dead (llvm 4.7 only)


CBMC is a model checker though right, not a symbolic execution engine? Or do you feel for all intents and purposes there is no difference?


It's the very same symbolic execution engine as klee, which just a bit more features, and a bit simpler to use.



It should! Kani’s used by AWS’ Firecracker hypervisor.


I would like to - but unable to find any libraries / engines that would function in a some sort of 'mocked' environment. for example - i need to cover all execution paths for only single function with high cyclomatic complexity, given all other stuff is mocked. Maybe it would be called unit test mix with se. also for dynamic languages such as php/python/ruby.




Consider applying for YC's W25 batch! Applications are open till Nov 12.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: