At some point, that simple framework won't do what you need it to, so it grows. And grows. Specific, "clever"/"smart" frameworks are initially awesome for avoiding boilerplate. But the more specific you make the framework, the sooner you hit its limits; the more clever you make it, the less comprehensible and more brittle the tests become. Lack of documentation is a constant problem, as is predicting what such a framework will need to support. More often than not, it ends up an unmaintained mess that someone down the line will have to throw out, rewriting all the tests along with it.
For acceptance testing, RobotFramework seems like overkill, and the syntax is weird. But it's a super effective generic testing framework, and suits the needs of thousands of projects. It's not perfect and you'll probably never love it, but it'll get the job done, no matter how big the project.
I agree with the bad state of C unit test frameworks. We had to figure out how to add unit testing to C code, and in the end we just went with Google Test and all the headache that C++ brings with it. That was a while ago. After a quick search, I think I'd give µnit a try first.
The reason I was even trying to build the code is to get the output `./tests` produces, so I could rant about how difficult it would be for me to integrate this into e.g. Jenkins with decent feedback of which tests fail.
IMO, teamwork is essential, and I'd rather not work with somebody who is happy to put the burden on other developers just so he can save a bit of time by rolling his own. Thanks but no thanks.
Jenkins integration is obviously not going to work as that was never the goal. I think CI is mostly a cover up for failing culture, an excuse to waste more time on ceremony.
I'm glad you got your rant.
For me, benchmarking isn't really useful unless it's a profiler in a real world use situation. Otherwise you have no idea how often each API function is called.
Super easy to use, header only so no need to compile anything beforehand, no external dependencies, reasonably fast compilation time, and does everything I need.
The fact it took over my programs' command line options and wouldn't give 'em back didn't help either. (Many of my tests are data-driven and use command line options to indicate where to read the data from - I already have a library to handle this! I didn't see - and still don't - why a test framework should be getting involved.)
I found using Boost was a smaller hit to build times than Catch, and the suggestion of building Catch in a separate translation unit didn't really help.
I'm guessing if your build times are already godawful you may not notice, but I sure as shit did.
Ultimately I found and went with dessert: https://github.com/r-lyeh/dessert
It's stupidly simple and doesn't do a lot of the things that other C++ testing frameworks do, but it builds lightning quick and does the 2 things I expect out of a testing framework:
1. it tests, and
2. it reports failures.
I'll probably run into severe limitations that I can't deal with/work around, but for now I'm pretty happy with it and even if I end up using something else in 6 months, I'll still consider it worthwhile.
It has the drawback that one failing test aborts the whole program. The error messages are not the most informative, but it does the job.
assert(test_foo() /* Checks the sanity of Foo */);
I prefer the regular assert() though, because I want the build to fail via an aborting test program, and all I care about is the line number of the failure. Another benefit is that abort() will stop gdb at the exact place the test failed, with the stack intact if you want to inspect it.
assert(("test_foo() should be true", test_foo()));
assert(condition && "message");
Benchmarking may be a different concern. I know you can do something with rules, but it may require more time to set up.
By the way: creating a small, self developed tool that just does what you need is sometimes a good idea, but it doesn't mean everything else is shit. Any library serving a large user base will serve somebody better than others.
Of course, that's why you use other tools like TestNG. Unit tests aren't _units_ if they're interdependent.
Edit: Looks like the author was trying to execute multiple overlapping test suites. Haven't tried that.
The website design does rather give the impression that it's done. I know that if I read the text, I learn that it's not, but my brain categorises those paras as marketing fluff and ignores them.
But it looks very much like what I want from Java testing. My views have, admittedly, been warped by RSpec and Jasmine.
When developing code, you'll almost always end up writing some ad-hoc testing code anyway: Generate some input data, call the code you're testing, printf() some results. At development time, you read this output to verify your code's working well. My system just makes this a bit more methodical and saves a 'known good' result - if the output changes then the code may be wrong. Once you've fixed the issue (or determined that the new behaviour is in fact correct) you update the known-good output.
Dead simple, basically maintains itself, still catches any problems that the tests would catch.
I've similarly rolled my own bash-based solutions, as testing is really as simple as running a script and getting a '0' or non-zero exit code.
I kept it very vanilla C and avoided having to maintain a suite structure, in favour of a function-call system which can be used on embedded systems without many resources.
I should probably do more to maintain that project and improve it as I have outstanding PRs it seems!
See https://github.com/DJMelksham/testy/tree/master if anyone is interested, though I'm not sure how useful it will be at this stage.
Like the article author, I settled on tags for tests rather than explicit hierarchies.
Also somewhat like the author, I care about benchmarks to a degree, so amongst other things each test keeps its latest timing and result stats; and because tests serialise to text/Lisp code, each test-code/test-run can be version-controlled with a project.
My project has some slight diversions from, and additions to, the article:
1. Being able to capture previously evaluated code and returned results from the REPL and automatically package up the form and the result as a test. Obviously not as applicable to C, but I found myself interactively trying to get a function to work, then capturing the result upon success as a regression test, rather than the usual "design test up front" from TDD (although that's possible too).
2. I had a long long internal debate about the philosophy of fixtures. The concept of environments/fixtures in which you could run a series of tests was in fact my initial plan, and although I've backed out of it, I think I could theoretically add such a thing back in, but I'm not sure I want to now...
It seems to me that by supplying fixtures/environments in which you run multiple tests, you gain "not paying the setup cost/don't repeat yourself" and "flexibility of running tests in arbitrary environments". What I considered the cost was the tendency to move away from true test independence (the benefit of which is easier naive multi-threading of the test suite), and the loss of full referential transparency/documentation of a test. Test 2784 says it failed in the last run. Is that because I ran it in context A or context B? What else was running with it, and what caused the failure? What is the performance/timing actually measuring? Sort of like lexical scoping, I wanted the reason for a test's values and failures to be, as much as possible, "in the test source" and nowhere else.
This philosophy of mine creates obvious limitations: to try to get around them a bit, while keeping the benefits, I made two more design choices.
3. Composable tests: New tests can be defined as the combination of already existing tests.
4. Vectorise test functions! Most important functions work on vectors or sets of tests as well as individual tests. (run-tests), for instance, runs a vector of tests passed to it. (all-tests), (failed-tests), (passed-tests), (get-tests-from-tags), etc., all return the vectors of tests you'd expect from their names, which can then be passed on to the other functions that map across vectors of tests. (print-stats), for example, can print statistics on the entire test suite (because by default its input is the result of (all-tests)), but it also accepts any arbitrary set of tests, be it passed tests, failed tests, or a particular set marked by a specific tag.

And because each test is self-contained, all results/reports can be generated by just combining the internal results/reports of each individual test in the vector. Copy a set of old tests as a basis for new ones, compose multiple tests or functions into one, and/or map new settings or properties across an arbitrary group of tests.
Anyway, I'm curious to hear other people's experiences and design choices in this regard.