agroce's comments

https://blog.trailofbits.com/2019/01/22/fuzzing-an-api-with-... provides a somewhat friendlier way to do this, via DeepState, with delta-debugging, save/replay, and the ability to use more fuzzers.
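To give a concrete (if simplified) idea of what a DeepState harness looks like, here is a minimal sketch; parse_record is a hypothetical function under test, not something from the post:

    // A minimal sketch of a DeepState harness. parse_record is a
    // hypothetical function under test, assumed to return 0 on success
    // and -1 on rejection.
    #include <deepstate/DeepState.hpp>

    using namespace deepstate;

    extern "C" int parse_record(const char *buf, int len);  // hypothetical

    TEST(Parser, NeverCrashesOnArbitraryInput) {
      char buf[32];
      int len = DeepState_IntInRange(0, (int)sizeof(buf));
      for (int i = 0; i < len; i++) {
        buf[i] = DeepState_Char();  // each byte is chosen by the fuzzer
      }
      int rc = parse_record(buf, len);
      // Assumed contract: anything other than 0 or -1 is a bug.
      ASSERT(rc == 0 || rc == -1);
    }

The same harness binary can then be run under DeepState's built-in fuzzer, libFuzzer, AFL, or a symbolic executor, with failing inputs saved, replayed, and reduced as the post describes.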


If you want to fuzz a library by making a series of API calls, DeepState has special support for that, though of course it won't handle some of the other issues with network applications.
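Roughly, the call-sequence style uses OneOf inside a loop so the fuzzer picks one API call per iteration; the stack_* API below is hypothetical, and the std::vector acts as a reference model to check results against:

    // Sketch of fuzzing a call sequence with OneOf; the stack_* API is
    // a hypothetical library under test (declarations only).
    #include <deepstate/DeepState.hpp>
    #include <vector>

    using namespace deepstate;

    struct stack_t;
    extern "C" stack_t *stack_new(void);
    extern "C" void stack_push(stack_t *s, int v);
    extern "C" int stack_pop(stack_t *s);
    extern "C" void stack_free(stack_t *s);

    #define NUM_CALLS 20

    TEST(Stack, CallSequence) {
      std::vector<int> model;  // reference model for checking results
      stack_t *s = stack_new();
      for (int n = 0; n < NUM_CALLS; n++) {
        OneOf(
            [&] {  // push a fuzzer-chosen value
              int v = DeepState_IntInRange(0, 100);
              stack_push(s, v);
              model.push_back(v);
            },
            [&] {  // pop (only if non-empty) and compare with the model
              if (!model.empty()) {
                ASSERT_EQ(stack_pop(s), model.back());
                model.pop_back();
              }
            });
      }
      stack_free(s);
    }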


Good point! I added a little paragraph emphasizing the reasons you want libFuzzer etc. over a dumb fuzzer in the long run.
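For contrast, a libFuzzer entry point is just a function over a byte buffer; what makes it "smart" relative to a dumb fuzzer is the coverage feedback guiding mutation. parse_record is again a hypothetical target:

    // Minimal libFuzzer target sketch; build with something like
    //   clang++ -fsanitize=fuzzer,address harness.cpp parser.cpp
    #include <cstddef>
    #include <cstdint>

    extern "C" int parse_record(const char *buf, int len);  // hypothetical

    extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
      // libFuzzer mutates `data` guided by coverage feedback: inputs that
      // reach new code are kept and mutated further, which is the main
      // long-run advantage over a dumb, purely random fuzzer.
      if (size > 4096) return 0;  // arbitrary size bound, just for the sketch
      parse_record(reinterpret_cast<const char *>(data),
                   static_cast<int>(size));
      return 0;
    }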


The SPLASH Onward! 2014 essay concludes with this advice to practitioners:

In some cases where coverage is currently used, there is little real substitute for it; test suite size alone is not a very helpful measure of testing effort, since it is even more easily abused or misunderstood than coverage. Other testing efforts already have ways of determining when to stop that don’t rely on coverage (ranging from “we’re out of time or money” to “we see clearly diminishing returns in terms of bugs found per dollar spent testing, and predict few residual defects based on past projects”). When coverage levels are required by company or government policy, conscientious testers should strive to produce good suites that, additionally, achieve the required level of coverage rather than aiming very directly at coverage itself [56]. “Testing to the test” by writing a suite that gets “enough” coverage and expecting this to guarantee good fault detection is very likely a bad idea — even in the best-case scenario where coverage is well correlated with fault detection. Stay tuned to the research community for news on whether coverage can be used more aggressively, with confidence, in the future.


I'm biased (it's my research field, in part), but I'd suggest that studies on coverage are all over the place: this one shows a lack of correlation, while other studies show good correlation between coverage and some measure of effectiveness (the ICSE paper mentioned above, also from 2014, a TOSEM paper coming out this year, and a variety of other publications over the years).

http://www.cs.cmu.edu/~agroce/onwardessays14.pdf covers the Inozemtseva et al. paper as well as some other recent work, and nothing in the time since we wrote that has modified my view that the jury is still out on coverage, depending on the situation in which you want to use it. Saying "coverage is not useful" is pretty clearly wrong, and saying "coverage is highly effective for measuring all suites in all situations" is also clearly wrong. Beyond that, it's hard to make solidly supported claims that don't depend greatly on details of what you are measuring and how.

I suspect Laura generally agrees, though our guesses about what the eventual answers might be probably differ.


