Add experimental fuzz test support for Go 1.17 (github.com/golang)
121 points by typical182 on Feb 23, 2021 | 30 comments



Go tests and benchmarks are so easy to write and run: just add TestFoo and BenchmarkFoo functions to a bar_test.go file, and "go test" does the rest. Fuzzing is currently doable, but it requires a 3rd party library (go-fuzz) and a bit of fluffing around. This will make fuzz testing an equally first-class citizen in the standard Go tooling (just add FuzzFoo), and as such we'll probably see a lot more people testing with fuzzing.

I used go-fuzz in GoAWK and it found several bugs (see https://benhoyt.com/writings/goawk/#fuzz-testing), and almost everyone who's done fuzz testing has similar reports. Certainly go-fuzz has found many, many bugs in Go itself: https://github.com/dvyukov/go-fuzz#trophies

For what it's worth, I wrote an article for LWN about the upcoming support for built-in fuzzing in Go: https://lwn.net/Articles/829242/ (of course, if you want full details, read the full proposal).


I remember emailing you on the topic of fuzzing goawk, after I'd fuzzed GNU Awk.

I continue to believe that fuzzing is magical, having configured it in as many of my projects as I can. I've definitely found issues in things I thought I'd covered in my normal tests, and I'd definitely welcome it becoming more widespread.


Yes, I remember, thank you! That's how I got started with go-fuzz.


> and almost everyone who's done fuzz testing has similar reports.

Added fuzzing at a recent place and yep, everything went to hell. About to add fuzzing at current place where everything will almost certainly go to hell but at least there's institutional will to fix the issues and no legacy crippling what fixes can be made.


I never understood their reason for the verbose and unfriendly ergonomics:

if a != b { t.Errorf("expected %s to be equal to %s", a, b) }

Instead of something like:

t.Expect(a).To.Equal(b)

And, of course, it gets uglier when you're trying to test a function that returns a (value, error) pair, which, again, is something a friendlier testing library can help with.


It's because the Go philosophy is "tests are code", so you use the standard language features like != instead of a whole new domain-specific language that you have to learn. I've written more about that here: https://lwn.net/Articles/821358/

Personally I find the "cutesy" names like Expect(a).To.Equal(b) and It("should ...") patronizing and too English-like (code is not a human language). Similarly, I wouldn't like a language that made you write:

   Please.Print(1.Plus.2)
   Log.To(Console).Text("did something")

The vanilla Go approach can get a bit verbose when you're doing lots of asserts. In that case, you can use standard language features like refactoring things into variables or helper functions, or table-driven tests.

But if you really want assert-style code, I like the https://labix.org/gocheck approach:

   c.Assert(a, Equals, b)

because there's only one function (Assert) and a bunch of comparator names to learn.


While I don't care much for the fluent cutesy language, as you call it, the error messages and consistency are where it shines. Getting an entire team to write similar error messages in test cases is a fool's game: just agree on an assert library. Added bonus: the error messages can be amazing (if the lib is good).


This approach works well in more dynamic languages with some level of introspection that can derive meaningful assertion failures from that code.

Coming from the C# world, the thoughtless use of Assert.* functions leads to the test failure of "Assertion failed: Expected true, got false." in 90% of cases.


The fluent API flavor is how jest works in javascript-land, and while it makes the code read ok, it makes for some lousy assertion failure messages.

Typically, you get some autogenerated variation of "expected 3, got 5" - which in a large codebase is kinda useless because you don't necessarily have context on what the error refers to without going back to the test source code to see where the failure originated from. This is especially problematic with table-based tests or tests with large numbers of assertions where the closest human-readable label might have been described too generically.

I much prefer to be able to write "foo broke in such-and-such a way: expected %s, got %s" than have the team rely on a machine attempting to generate a description and getting a bad one, because the human is encouraged to be too lazy to write a proper description and the test framework isn't smart enough to write one for them.


I'm not sure what you find ugly; actually, I find your example ugly because it seems to be full of magic.


My answer would be: flexibility.

And I'd argue the alternative is more verbose since you're not invoking the error/fatal.


I feel like the downside is that those libraries are always one step behind the assertion you actually want. Our codebase at work uses something called "testify" for whatever reason, and it's always missing the things I want to test. For errors, it doesn't have any As/Is support; it just does an exact string match. That's all you get, so people are forced to write worse tests (or change the library, but guess what 99.9% of "line programmers" are going to do.)

If you demand your pet assertion, you have to maintain some extension on top of the library you use, or wait for upstream to add it -- leading to the maintainer having to maintain a 2000 function library of everyone's pet assertions, which isn't fun, and which they probably won't do.

Another problem with assertion libraries is that edge cases are buried in details far away from the test you're writing. What equality function does To.BeEqual() use? Is null equal to undefined? You really have to hope it doesn't matter. But if the assertion is the if condition, you're not adding any new semantics to your language; you know what == does, and the code says ==, so you don't have to guess.

Finally, error output suffers the more generic you make it. You can review a screenful of hand-written assertions, but autogenerated error output always seems to be one screen per error, meaning that you have to do more looking around to fix multiple tests. Minor complaint, but it's the nature of writing a formatter for interface{} vs. exactly what your test tests.

Overall I find it not worth it. Most of the time you'll be writing a table-driven test, and you're only writing the assertions a handful of times for tens of test cases. (A comment below complains that table-driven tests don't point you at the failure, giving you a vague error like "test 42: fail". I personally add a free-form "name" to every test case to avoid that particular problem.)

My main complaint with Go is how the official style guides don't make conventions around the hand-rolled assertions clear. At least when I was learning Go at Google, you pretty much had to use the form:

   if got, want := len(foobars), test.wantFoobarCount; got != want {
      t.Errorf("foobar count:\n  got: %v\n want: %v", got, want)
   }
or your code would not be approved. (The style points here are: the variables should always be called got and want, the output should be in got, want order, and the colons on got and want should be aligned.) Nobody in the real world does this, for whatever reason, and there is no "Code Review Comments" to point to to say why it should be got, want and not want, got. (Why? Because it's arbitrary and someone else already picked.)


There are some ‘got’ and ‘want’ pointers in https://github.com/golang/go/wiki/TestComments .

But yeah, I didn’t even realize this was a very Google thing :).


> it doesn't have any As/Is support

It does: https://pkg.go.dev/github.com/stretchr/testify@v1.7.0/assert...


I really hope this is a trend and randomized testing just becomes part of standard testing tools. I used jqwik extensively at my last job and it was very useful for finding issues with null on objects with lots of fields.

It's much easier to just describe how to construct data rather than coming up with test data and edge cases yourself. Sure you might have test cases for every nullable field, but do you have tests for every pair of nullable fields?

I also found some interesting behavior this way. StringUtils.trimToNull(str) == null is actually not the same as checking if a string is blank due to weird unicode peculiarities.


Is there a clear dividing line between fuzzing tests and property based testing? They seem to use different words, but do similar things, like generate interesting input and making sure things still work when that interesting input is applied.

Is the difference instrumentation vs hand written generators and hand written invariants?


Philosophically, I’ve found that property-based tests are more like “no matter what this sorting algorithm does, it absolutely should have the same number of elements before and after.” You’re so confident in that assertion that you’re willing to transcend specific test cases and subject the logic under test to limited randomness. Fuzz testing is more like a blunt instrument for finding tests that you should have written, by devising more and more creative input. There’s less intention behind it other than “this shouldn’t explode, and if it does I want to know.”


The stretch goal of structured fuzzing beyond []byte sounds like property testing to me.


Agreed, it seems like in the limit one looks like the other, and we should talk about them together. And property testing has some of the same practical implications that Rob Pike mentions for fuzzing: it can take a long time, and it's not clear how to run it in CI without slowing things down (or at least it seems that way to me).


While completely unrelated to the new fuzz feature, one library I'd love to see added to the standard library is one for more standard units besides time, such as a distance library. It would be awesome to be able to do something like `foo := 100 * distance.Meters` which returns a distance.Distance type.

At the company I work at we do a lot of stuff with distances and time, and with the time library it's a lot easier to have strong time typing throughout the codebase; but without a standard distance library, I've seen things in meters, kilometers, miles, etc.

With every new version they can introduce new unit libraries until there are ones for velocity, acceleration, area, volume, mass, temperature, etc. This would make golang a great option for many business uses and scientific uses.

The other testing-related thing golang should standardize in its stdlib is a clock library like facebookgo/clock for time control. It's already standard practice to inject the rand library with a seed so you can have determinism in tests. The same is needed for controlling time in many use cases if you want determinism in tests involving time. Making this part of the stdlib, and making it best practice to always dependency-inject a clock instead of using a global like time.Now(), would go a long way toward improving more complex unit/integration testing where time is involved.

https://pkg.go.dev/github.com/facebookgo/clock


Does anyone have experience with Gopter, a Golang Property Based testing library? https://github.com/leanovate/gopter


I do.

I implemented a specialized json-patch and tested it with gopter.

My colleague implemented an equivalent of `openssl passwd` in go and used gopter to test it (comparing outputs between his implementation and what openssl returns).

Needless to say, gopter found subtle bugs in both cases.

I came to property-testing from PropEr (erlang), and was pleasantly surprised how well gopter worked for us.


I would like to +1 jacque's request. I would also be really interested in a rundown of gopter.


> My colleague implemented an equivalent of `openssl passwd` in go and used gopter to test it (comparing outputs between his implementation and what openssl returns).

Could you talk more about this? I'd love to see the code or any other docs you might have.


It's a re-implementation of https://akkadia.org/drepper/SHA-crypt.txt in Go. Because Linux does not support shadow in bcrypt or scrypt. Also see https://access.redhat.com/articles/1519843.

Unfortunately, none of that is open source.


I like this idea a lot and I think I support adding it directly to the testing library. Fuzz testing has been pretty valuable to me in my time using it with Go and I would love to see it have more widespread support.


Agreed. GoFuzz has been a huge asset for me, but it occasionally suffers bugs due to the fact that it's hacking the runtime to some extent. IMHO that makes fuzzing an excellent candidate for inclusion in the standard toolchain.


At the risk of turning this into a Rust thread, is fuzzing less relevant in languages with more strict handling of nullables?

I can see it being very useful in go-land, since it's relatively easy to make mistakes w/ nils that the compiler won't complain about, but I'm curious how far having a stronger type system can go vs fuzzing.


Fuzzing is still very relevant in Rust. It tends to find panics rather than segfaults, but that's still bugs.

https://github.com/rust-fuzz/trophy-case


Just to add, we're a heavy user of fuzzing in Diem[1] and we found a good number of bugs thanks to it : )

https://github.com/diem/diem/



