DustMite: A General-Purpose Data Reduction Tool (dlang.org)
66 points by aldacron 6 months ago | 6 comments



Seems like a relatively standard test case reduction tool aimed at D.

The tool reduces test cases to make them easier to debug. It removes pieces from the input data and tests each variant to see if it still reproduces the bug.

In the end you have a much smaller reproducer which is easier to debug. Here's a tutorial for gcc [2].
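The remove-and-retest loop described above can be sketched in a few lines. This is only an illustration of the general idea, assuming a line-based input and a caller-supplied `still_fails` predicate (both names are made up here); it is not how DustMite, delta, or creduce are actually implemented, and real tools use smarter strategies than dropping one line at a time.

```python
from typing import Callable, List


def reduce_lines(lines: List[str],
                 still_fails: Callable[[List[str]], bool]) -> List[str]:
    """Greedily drop single lines while the predicate keeps returning True.

    `still_fails` is a hypothetical oracle: it returns True when the
    candidate input still reproduces the bug.
    """
    if not still_fails(lines):
        raise ValueError("the original input must reproduce the bug")

    changed = True
    while changed:  # repeat until a full pass removes nothing
        changed = False
        i = 0
        while i < len(lines):
            candidate = lines[:i] + lines[i + 1:]
            if still_fails(candidate):
                lines = candidate  # line i was not needed; keep the smaller input
                changed = True
            else:
                i += 1  # line i is needed; move on
    return lines


if __name__ == "__main__":
    source = ["import a", "x = 1", "y = 2", "crash(x)"]

    # Toy oracle: the "bug" needs both the definition and the crashing call.
    def oracle(candidate: List[str]) -> bool:
        return "x = 1" in candidate and "crash(x)" in candidate

    print(reduce_lines(source, oracle))  # prints ['x = 1', 'crash(x)']
```

The result is a local minimum: no single remaining line can be removed without losing the failure, but a different removal order might find a different (or smaller) result.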

Most serious compiler projects use one in some shape or form, e.g. delta [1] (which was probably the first widely used one) or creduce [3] (which hugely improved the state of the art for C), as well as various descendants that reduce a different kind of input (like LLVM bugpoint [4]). A lot of the original ideas go back to Andreas Zeller's delta debugging [5].

Somehow the blog author forgets to mention this rich history.

[1] http://delta.tigris.org/

[2] https://gcc.gnu.org/wiki/A_guide_to_testcase_reduction

[3] http://embed.cs.utah.edu/creduce/

[4] https://llvm.org/docs/Bugpoint.html

[5] https://en.wikipedia.org/wiki/Delta_debugging


The forum post he linked to, where he announced DustMite, says it's inspired by Tigris Delta and lists some advantages:

https://forum.dlang.org/post/op.vvsvhh1ptuzx1w@cybershadow.m...


This is cool, in any case.

TL;DR: DustMite feeds reductions/variations of a data set (like your source code) into an oracle which tests if it satisfies some property. The primary example is reducing your source code to a local minimum that still exhibits some compiler failure.
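To make the "oracle" idea concrete, here is a rough sketch of one: it writes a candidate to a temporary file, runs an external command on it, and reports whether an expected message shows up in the output. The names `make_oracle`, `command`, and `expected_text` are invented for this example, and this is not DustMite's actual interface; see its documentation for how test commands are really specified.

```python
import os
import subprocess
import sys
import tempfile
from typing import Callable, List


def make_oracle(command: List[str], expected_text: str) -> Callable[[str], bool]:
    """Build a predicate that checks whether `command` still emits `expected_text`.

    Hypothetical helper: `command` is the program to run (e.g. a compiler
    invocation); the path of the candidate file is appended as the last argument.
    """
    def oracle(candidate: str) -> bool:
        # Write the candidate to disk so the external tool can read it.
        with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
            f.write(candidate)
            path = f.name
        try:
            result = subprocess.run(command + [path],
                                    capture_output=True, text=True)
        finally:
            os.unlink(path)
        # "Interesting" means the failure message is still present.
        return expected_text in (result.stdout + result.stderr)

    return oracle


if __name__ == "__main__":
    # Stand-in for a buggy compiler: fails with "BUG" when the input contains "bad".
    fake_compiler = [
        sys.executable, "-c",
        "import sys; s = open(sys.argv[1]).read(); "
        "sys.exit('BUG' if 'bad' in s else 0)",
    ]
    still_fails = make_oracle(fake_compiler, "BUG")
    print(still_fails("good bad"), still_fails("good"))
```

Matching on a specific message rather than just "the command failed" matters in practice: otherwise the reducer tends to drift toward some unrelated, easier-to-trigger error.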

The article's conclusion notes some other cool uses; my favorites were:

- "reducing a large commit to a minimal diff"

- "reducing a commit list, when git bisect is insufficient due to the problem being introduced across more than any single commit;"

- "reducing a large data set to a minimal one, resulting in the same code coverage, with the purpose of creating a test suite;"

- "if you have complete test coverage, it can be used for reducing the source tree to a minimal tree which includes support for only enabled unittests. This can be used to create a version of a program or library with a test-defined subset of features."


I would be interested in a performance comparison of delta debugging tools. In my experience, the performance of this type of tool has not been great (to be fair, I have only tried creduce, delta, and my own coding experiments), although I might give DustMite a try.


Is this the intended entry point? Should this have linked to https://dlang.org/blog/2020/04/13/dustmite-the-general-purpo... or similar?


We've changed to that from https://forum.dlang.org/post/wntuwcsudlzrmkwrsdxe@forum.dlan.... Thanks!

(I also detached your other comment so that it can float to the top and get more attention.)



