Hacker News new | past | comments | ask | show | jobs | submit login

This is my guess also. Even in large engineering-driven companies, you often discover during your very proud (internal) talk about Shiny New Tech that other people in the group have been doing it for some time...

But you never know, maybe I looked in the wrong parts of the Internet.

I'm genuinely curious though: since most of those projects are open-source you'd expect to see some traces of some activity/problems on GitHub or on mailing-lists, and... well... Not much to be seen.

I've seen some rare papers with CBMC and I know the tech has been ported to Java, javascript and apparently something is in progress for Ada ?

angr I tried very hard to use on Ada programs but it seems the VEX lifter and CFG builder couldn't handle simple Ada programs (wasn't patient enough to submit this on GitHub...). But for C programs it's very fun and all scriptable ! There is also this nice of a 'pluggable test environment' (deepstate from Trail of Bits https://github.com/trailofbits/deepstate) that allows to switch between hand-written unit tests, fuzzers, and symbolic execution (letting you decide which variables are Symbolic, which are concrete...) with angr. Also some cool stuff mixing AFL and angr (driller) and another project mixing Intel PT and angr.

KLEE seems to have the most papers (quantity and quality), just landed support for C++ and upgraded to a recent llvm...

We at Trail of Bits are also working on "klee-native," a port of KLEE using Remill to dynamically lift and emulate binaries. This project is very early on, though.

Ah ! When I saw some time ago that there were useable x86_64-to-llvm lifters and rewriters (was it from Trail of Bits also ?) I wondered how hard it would be to plug with KLEE. Good for Trail of Bits !

(off topic: your blog articles and tech are very inspiring. The deepstate paper, articles and slides made me rethink the whole test pipeline of my applications, from manual oracle-based or model-based tests, to PBT, to fuzzing, to Symbolic/Concolic execution...).

By the way I thought remill and mcsema required IDA ?

Back on topic: One of the reasons I asked about industrial use, is that almost everyone in the security community (that works on SE) seems focused on reverse engineering executables. While I understand the interest and the importance of this part (being able to work on binary code without the source code) I think it's an adversarial view of the topic of Symbolic Execution. It's really hard working precisely on binary code, even if you disable optimizations...

Well, what if I have the source code and want to use SE ? Couldn't I be more precise/efficient ? I'm guessing I'm stuck with KLEE if I'm working in C/C++, and to write an frontend to llvm for Ada stuff...

There are a number of lifters. The McSema repo has a table detailing the features of most of them.

Glad that DeepState had an impact on you :-D We continue to evolve DeepState, both in the direction of better fuzzing, and better test case reduction.

Remill is instruction granularity, and so all it requires is raw bytes. McSema uses Remill in conjunction with a disassembly frontend (IDA Pro, Binary Ninja, or Dyninst).

If you have source code you can likely be more precise/efficient. Sometimes you may have access to source but not the ability to change/influence the build.

I think there's a lot of room for improvement with KLEE. If I were to write an LLVM symbolic executor from scratch then I think I would do some things differently.

Thanks pag for the correction on remill. And I didn't know about mcsema/dyninst, great !

Do you have something written somewhere on how you would do different from KLEE ?

Nope :-P

if you're interested in lifters, this research is a pretty eye opening peek into the guts of several, and how one goes about actually testing their accuracy


By the way, did you know that DeepState is actually pure C, and that all the C++ is just window dressing on top? This means that you could feasibly integrate components of it in Ada, or call to those components in Ada, assuming that you haven't already produced your own variants of things.

Tried to connect DeepState ans Ada through C indeed, but since I can't make angr work with Ada yet, it's not really useful right now. For fuzzing I think DeepState was limited to libfuzzer ? Which needs clang or someone to port this to gcc-gnat (humpf) and I'm not sure DeepState supported AFL yet (EDIT: just looked at the repo now, and I see support for AFL and eclipser, Nice !) ... Will retry all this soon though. Thanks for the reminder !

What I was thinking was some improvements to gnattest (https://docs.adacore.com/gnat_ugn-docs/html/gnat_ugn/gnat_ug...) but also a way to add some quickcheck-like generator features (Ada already has the property description language through contracts) with Libadalang... One can dream !

But the whole "Let the developer write one test harness and use it almost as-is with different testing/validation techs" (I'm not explaining it well) was some kind of revelation.

When you say 'adding a fuzzing test harness is only 2-3 days work' you still get complaints: that's too much (but I then found dozens of bugs), it's /another/ test harness to maintain, we'll have to rebuild an input corpus for every interface break (true...). Anything that could alleviate the pain would be great...

Hopefully we'll have a Dockerfile soon that gets things all set up :-D

Registration is open for Startup School 2019. Classes start July 22nd.

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact