> MUST This word, or the terms "REQUIRED" or "SHALL", mean that the definition is an absolute requirement of the specification.
> SHOULD This word, or the adjective "RECOMMENDED", mean that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course.
Thank you for posting this. At first, I thought your reply was an attempt at sarcasm—but on searching, I found the relevant RFC. Just shows that even after 30+ years of programming, there is so much that I don't know.
This is a common standard in aerospace requirements documents.
Join the world of needing to make the FAA happy and your vocabulary will change to incorporate shall, will, must, and others in a more discerning manner.
I'm doing exactly the same. Another benefit is that you save some length in the test name, which can be quite helpful in certain scenarios.
Absolutely. I tell people this constantly. If you write a test that says “it should have length 256”, that’s an assertion on the design intention, not on the actual functioning of the code. “It should have length 256” == true as soon as the product guy says it should. “It has length 256” == true when I’ve written the code to make it have length 256.
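To make the distinction concrete, here is a pytest-style sketch in Python (the function and test names are illustrative, not from the thread). Both tests have the same body; only the name changes what the failure message claims:

```python
def make_buffer() -> bytes:
    # Hypothetical function under test.
    return bytes(256)

# Named after the design intention -- "true" as soon as it's specified:
def test_should_have_length_256():
    assert len(make_buffer()) == 256

# Named after observed behavior -- true only once the code delivers it:
def test_has_length_256():
    assert len(make_buffer()) == 256

test_has_length_256()
```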
“Shall” is another term I’ve seen used in larger orgs where teams literally craft test requirements as a gigantic database. “Shall” is for things that must be satisfied, “should” is for softer requirements and is rarely used.
This is why arbitrary rules are bad. 2/3 of your examples there would fail code review because they don’t start with ‘should’. Sure you could rename them to ‘should not handle zero’ but is that better? I would argue not.
Back when I was on a team that used it, I liked Doxygen because it would take your comment and drop it into a sentence, so if I provided "fluxes the capacitor", I'd see "TimeTravelingCar fluxes the capacitor." in the documentation. If I capitalized it, my documentation would have "TimeTravelingCar Fluxes the capacitor.", which looks wrong, so I learned not to capitalize. If I added punctuation at the end, it would read "TimeTravelingCar fluxes the capacitor..", so I learned not to do that.
It was a nice way to learn implicit rules in an applied manner.
(Sorry for the nonstandard punctuation, but it's necessary here.)
I will often use a concise test name, and drop a comment alongside describing what the test is evaluating if it's not obvious. Works great, and I'm only constrained by my ability to write English.
I'll also admit that I was a big DSL fan back in the Ruby heyday. Works great if it's only you on the project, but requiring a whole team to be DSL domain experts for something like testing is asking quite a lot.
Nothing in this article makes sense to me. What is this, Hungarian notation for tests?
What if a test is testing that something shouldn’t happen? Apostrophes aren’t even legal function names in most languages. (/s)
> If the only output of a failing test is just a binary value like “FAIL”, that test is only giving one bit of information to the developer.
Well that aaaand
> A good test framework will also print the test name and a call stack.
Ok, so more than one bit of information: it literally tells you the line number and filename of the failed test so you can go look at it. But yeah, maybe a bad developer didn't put any comments in the test and the code isn't obvious, but sure, they're gonna name the test using a magic template.
> For example, they could point out that “should replace children when updating instance”
It’s actually worse than that example suggests. Stuff like Expect("type safety").ShouldBe(GreaterThan(13)) throws runtime errors.
The semantics of parallel test runs weren’t defined anywhere the last time I checked.
Anyway, you’ll be thinking back fondly to the days of TestShouldReplaceChildrenWhenUpdatingInstance because now you need to write nested function calls like:
Context("instances", func …
  Describe("that are being updated", …
    Expect("should replace children", …
And to invoke that from the command line, you need to write a regex against whatever undocumented and unprinted string it internally concatenates together to uniquely describe the test.
Also, they dump color codes to stdout without checking that they are writing to a terminal, so there will be line noise all over whatever automated test logs you produce, or if you pipe stdout to a file.
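The missing check is a one-liner; a sketch in Python, where sys.stdout.isatty() plays the role of C's isatty(3) and the color codes are ordinary ANSI escapes:

```python
import sys

GREEN = "\033[32m"
RESET = "\033[0m"

def colorize(text: str, color: str, stream=sys.stdout) -> str:
    # Emit ANSI escapes only when writing to an interactive terminal;
    # piped output (files, CI logs) gets plain text with no line noise.
    if stream.isatty():
        return f"{color}{text}{RESET}"
    return text

print(colorize("PASS", GREEN))
```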
Oh yeah I’ve seen stuff like that, but I just figure it will collapse under its own weight and die. Of course not without wasting tens of thousands of developer hours. But not my developer hours.
It’s crazy isn’t it? People inventing a DSL to describe tests which you could just see by reading the friggin’ source code. It’s like ORMs for tests. Let’s create an abstraction (test DSL) over another abstraction (a friggin’ programming language) so now we have to understand two languages.
So of course the solution is to write tests for the tests. Which will require coverage analysis. To make sure the devs take it seriously, if the tests on the tests don’t get 100% test coverage of the tests, then the dice-rolling app won’t build.
Oh no, please not again... In 2021 we had some of our workers take courses where this was taught. The effect was that many tests with really bad names were just renamed from testXXX to shouldXXX. It took a very long time to get those workers to write meaningful names and get rid of this "should" at the beginning (our framework recognizes tests by a name starting with test; the renamed tests have not been executed since the renaming).
> You usually have at least a number of tests for each method you are testing, and you are usually testing a number of methods in one test fixture (since you usually have a test fixture per class tested). This leads to the requirement that your test should also reflect the name of the method being tested, not only the requirements on it.
Ah, I get it now, thanks. I think I’d rather wrap the test method in an inner class named after the method-under-test, but not all testing frameworks allow for inner classes.
It took me a few seconds to parse what UnitOfWork meant in this context because that’s also a pattern for carrying e.g. database transactions through to different dependencies. E.g. in C#:
using var uow = UnitOfWorkContainer.Begin();
await SomeDep.Save(foo); // internally uses the uow's transaction
await SomeOtherDep.Save(foo2);
uow.Complete();
> It removes redundancy, because the function name should already be in the call stack.
I agree, as long as there is one method under test per file containing the tests. Otherwise, I put the tested method's name first (if it's a unit test) and adhere to the pattern [method]_[should]_[condition]_[result], e.g. Divide_Int_ByZero_FailsWithException()
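Translated to Python naming conventions, the same pattern might read as follows (the divide function and test name are illustrative, chosen to mirror the C# example above):

```python
def divide(a: int, b: int) -> float:
    return a / b

# [method]_[condition]_[result], snake_cased:
def test_divide_by_zero_fails_with_exception():
    try:
        divide(1, 0)
    except ZeroDivisionError:
        return
    raise AssertionError("expected ZeroDivisionError")

test_divide_by_zero_fails_with_exception()
```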
I thought "should" was a product of silly TDD, like my widget "should do this thing" and it doesn't yet because TDD, and then the developer would implement it.
GIVEN X and Y SHOULD Z works, but don't you think it weakens the statement? Essentially you're putting Y SHOULD Z inside a conditional. Wouldn't a plain Y SHOULD Z work better?
> Some frameworks require that test names are formatted in some special way, like starting with “test”, using snake case, camel case or other. I’m ignoring that part of the naming for brevity and to avoid focusing on a specific language.
When I first saw this, I thought it depends on what testing framework you're using.
Basically, isn't this just saying to follow the conventions of the testing framework?
Or just, y'know, write a brief comment. Testing is one of those places where the function name occurs exactly once, perhaps twice if the language doesn't have a nice testing framework. It doesn't need to be self-documenting.
Some assert functions take an optional string that's printed to stderr or stdout when the condition fails. I've found that a more useful place to put the text that would go into a comment, because you can see the presumed "why" of the failure without looking at the source.
This information could go into the test function name, but you can be much more verbose in the freeform text. And it's guaranteed to move along with the assertion during a refactor, unlike comments that can get separated from the code.
That's confusing. In regular code, "ensure" normally means that the code is responsible for taking the necessary steps to make a condition true if it isn't already true. For example, ensuring that a buffer is large enough by growing it if needed. Or ensuring that some data is in the cache by loading it if not already present. Tests, on the other hand, do not ensure that the condition they're testing is met. At best, they ensure that someone notices if the condition is not met.
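The contrast in Python (names illustrative): the first function ensures its condition by acting on it, while the test merely observes it:

```python
def ensure_capacity(buf: bytearray, needed: int) -> bytearray:
    # "Ensure" in regular code: make the condition true if it isn't.
    if len(buf) < needed:
        buf.extend(b"\x00" * (needed - len(buf)))
    return buf

def test_capacity():
    # A test only checks the condition; it fixes nothing on failure.
    buf = ensure_capacity(bytearray(4), 16)
    assert len(buf) >= 16

test_capacity()
```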
I see your point, but our tests ensure that our functions work as expected. Usually when testing function foo, the test is called test_foo. But if we have a function or class that will ensure that X is a valid sparrow, we have a test called ensure_valid_sparrow.
“Should” to me is too weak.
No, it’s not just “should”. It does. And if it doesn’t then either the implementation is broken, or the test is bad.
“Should” has connotations for me of something that we want but which we don’t require.
Therefore, I don’t use the word “should” in my test names.
Instead of “should have length 256”, I say “has length 256”.