
Writing a Test Case Generator for a Programming Language - azhenley
http://fitzgeraldnick.com/2020/08/24/writing-a-test-case-generator.html
======
DonaldPShimoda
Great article! Test case generation is pretty neat stuff! :)

---

The author mentions John Regehr.

One of John's earlier projects was Csmith, which was a specialized test case
generator for C. It was hugely successful in identifying hundreds of bugs in
various C compiler implementations, including LLVM and GCC.

The same research group (but without John now because he's working on other
things) is working on a project called Xsmith. Xsmith is essentially the
answer to the question "What if Csmith could be made to work on _any_
language?" Xsmith constitutes a DSL for specifying a programming language
(with a lot of the tricky technical stuff handled behind-the-scenes), so you
as the developer use Xsmith to specify your language and — POOF! You now have
a test case generator for your language.
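
To make the "specify a language, get a generator" idea concrete, here is a minimal sketch in Python. It is not Xsmith's DSL or API (Xsmith is written in Racket, and nothing below is taken from it). The `GRAMMAR` table, the `generate` function, and the `max_depth` cutoff are all hypothetical names chosen for illustration, and the toy language is just single-digit arithmetic.

```python
import random

# Hypothetical toy language spec: each nonterminal maps to a list of
# productions. By convention in this sketch, the FIRST production of each
# nonterminal is the non-recursive one, so it can serve as a fallback.
GRAMMAR = {
    "expr": [["term"], ["expr", " + ", "term"], ["expr", " - ", "term"]],
    "term": [["atom"], ["term", " * ", "atom"]],
    "atom": [["NUM"], ["(", "expr", ")"]],
}


def generate(symbol, rng, depth=0, max_depth=6):
    """Expand `symbol` into a random program string."""
    if symbol == "NUM":
        # Terminal class: a single random digit.
        return str(rng.randint(0, 9))
    if symbol not in GRAMMAR:
        # Anything that is not a nonterminal is a literal token.
        return symbol
    productions = GRAMMAR[symbol]
    if depth >= max_depth:
        # Past the depth budget, force the non-recursive production so
        # that generation is guaranteed to terminate.
        productions = productions[:1]
    production = rng.choice(productions)
    return "".join(
        generate(part, rng, depth + 1, max_depth) for part in production
    )


if __name__ == "__main__":
    rng = random.Random(2020)  # fixed seed so failures are reproducible
    for _ in range(5):
        print(generate("expr", rng))
```

A purely syntactic generator like this is the easy part. The "tricky technical stuff" (types, scoping, avoiding undefined behavior) is what a tool like Csmith or Xsmith has to handle on top of it.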

Development is ongoing, but the project has promise.

More specific to this post: an MS student in the project is working on a WASM
generator using Xsmith, which has helped guide a lot of the development of the
project.

I make no assertions about the usability of Xsmith at the present time. As I
said, development is ongoing.

---

Link to Xsmith repo:
[https://gitlab.flux.utah.edu/xsmith/xsmith](https://gitlab.flux.utah.edu/xsmith/xsmith)

Link to Wasm generator:
[https://gitlab.flux.utah.edu/xsmith/webassembly-sandbox](https://gitlab.flux.utah.edu/xsmith/webassembly-sandbox)

~~~
wtetzner
Seems like it might be a nice companion to Redex [0] for defining a new
programming language.

[0] [https://redex.racket-lang.org/](https://redex.racket-lang.org/)

~~~
DonaldPShimoda
Yeah, I suppose you could do that!

But I will say that Xsmith is designed to be self-contained. You specify both
the syntax and semantics of your language within the DSL to produce a random
program generator. Of course, at the moment Xsmith's semantic capabilities are
more limited than Redex's, but it's also designed to solve a different problem
than Redex is.

Maybe it would be neat to have an automated method of translating from Redex
to Xsmith's DSL though... hm. I'll have to think about that!

~~~
touisteur
Hi, big fan of the work done on Csmith/Xsmith.

It would be great if it used some existing infrastructure instead of asking
users to write in yet another DSL.

I'm thinking of an interesting thread from some time ago:
[https://news.ycombinator.com/item?id=21917927](https://news.ycombinator.com/item?id=21917927)

~~~
DonaldPShimoda
The reason Xsmith is a fresh project is that the PhD student doing most of
the implementation wanted to be able to make extensive use of Racket's macro
system. Using metaprogramming allows for all the DSL forms to be first-class
instead of, say, needing to run a program to generate the program that will
then produce the program to create the language's test cases. Of course, this
style of metaprogramming has downsides (for example, the start-up time is
pretty lengthy, and adding new forms to Xsmith can be tricky if you're not
familiar with the Racket macro capabilities), but overall it seems the
benefits may outweigh the detriments.

However, Xsmith's core is built upon RACR [0], a Scheme library for building
what are called _attribute grammars_. This isn't quite what you were talking
about, but at least it shows that not _all_ of Xsmith is 100% novel
implementation. :)
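
For readers who haven't met attribute grammars: the idea is that each node of a syntax tree gets computed properties, some passed down from the parent (inherited) and some built up from the children (synthesized). The sketch below is plain Python, not RACR's API, and the node classes and `free_vars` function are hypothetical names invented for this example. It computes one synthesized attribute (the unbound variables of an expression) using one inherited attribute (the names in scope).

```python
from dataclasses import dataclass


@dataclass
class Num:
    value: int


@dataclass
class Var:
    name: str


@dataclass
class Add:
    left: object
    right: object


@dataclass
class Let:
    name: str
    bound: object
    body: object


def free_vars(node, env=frozenset()):
    """Return the set of variable names used but not bound in `node`.

    `env` plays the role of an inherited attribute (names in scope, passed
    down the tree); the return value plays the role of a synthesized
    attribute (passed back up).
    """
    if isinstance(node, Num):
        return set()
    if isinstance(node, Var):
        return set() if node.name in env else {node.name}
    if isinstance(node, Add):
        return free_vars(node.left, env) | free_vars(node.right, env)
    if isinstance(node, Let):
        # The bound expression sees the outer scope; the body also sees
        # the newly introduced name.
        in_bound = free_vars(node.bound, env)
        in_body = free_vars(node.body, env | {node.name})
        return in_bound | in_body
    raise TypeError(f"unknown node type: {type(node).__name__}")


if __name__ == "__main__":
    tree = Let("x", Num(1), Add(Var("x"), Var("y")))
    print(free_vars(tree))
```

A random program generator could plausibly use scope information like this to emit only references to variables that are actually in scope. That is an illustration of why attributes are useful, not a description of how Xsmith uses RACR internally.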

[0] [https://github.com/christoff-buerger/racr](https://github.com/christoff-buerger/racr)

~~~
touisteur
Hi, thanks for the details. I didn't know Racket that well before, so I went
diving into your links, and it all looks very interesting.

I wasn't disparaging the approach at all; sorry if it came across that way.
Research is research: anyone willing to spend years of their life digging into
such a subject should use the most appropriate tech and follow their
intuition.

I'm just sad in general about the effort expected of developers (me!) or
testers (me!), e.g. having to write grammars in 3 or 4 dialects to be able to
leverage cool new tools ;-) I _know_ they all have their specificities; I just
wish there were a way to express them in one place/format.

I hope this work will, at some point in the future, make it possible to test
language parsers, compilers, interpreters, and pretty-printers, so I can
finally get some kind of answer to 'how do you test it?'.

