
The Cucumber Test Trap - robin_reala
http://tooky.co.uk/the-cucumber-test-trap/
======
ShardPhoenix
I'm sick of all these software engineering articles which are so vague and
give no numbers, no hard evidence, and not even any very specific anecdotes.

~~~
midas007
Software architects... the people that don't code but seem to have enough ego
to spare to tell everyone else what to do.

~~~
mattgreenrocks
Sounds like a culture issue rather than a "software architect" (whatever that
title means) issue.

------
aslakhellesoy
Judging from the comments most people here _still_ think Cucumber is a testing
tool.

Why is it so hard for people to realise that Cucumber's main value proposition
is not testing, but to establish a shared understanding between various roles
on a software project?

~~~
taybin
Because it's just a higher level wrapper around a unittest framework? I have
no idea what you mean when talking about various roles. What roles?

~~~
aslakhellesoy
> What roles?

Roles of people on a software team. For example:

Business analysts, Product owners, UX folks, Programmers, Testers, End users,
etc.

I have described this in more detail here:
[https://cucumber.pro/blog/2014/03/03/the-worlds-most-misunderstood-collaboration-tool.html](https://cucumber.pro/blog/2014/03/03/the-worlds-most-misunderstood-collaboration-tool.html)

------
programminggeek
I think people assume you need 100% coverage at every level of the testing
pyramid and so you have redundant tests. Also, with Ruby in particular, the
interfaces between objects and methods are so weak that you have no guarantee
that things will fit together and behave properly.

For example, in Ruby you can have a method with 2 parameters of X and Y. Maybe
X should be a string and maybe Y should be a float. You can unit test the
method to make sure it does the right things, but as soon as something else
calls that method there is no guarantee they will call it correctly or pass in
the right type of arguments. Thus, there are tons of potential integration
problems all over Ruby code.
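
A minimal sketch of that failure mode, written in Python rather than Ruby (both are dynamically typed, so the point should carry over). The names `format_price` and `checkout_summary` are hypothetical, made up purely for illustration.

```python
def format_price(label, amount):
    # Intended contract (not enforced anywhere): label is a str, amount is a float.
    return label.upper() + ": " + format(amount, ".2f")


def unit_test_format_price():
    # The isolated unit test passes because it calls the function correctly.
    assert format_price("total", 9.5) == "TOTAL: 9.50"


def checkout_summary(cart_total):
    # A caller written elsewhere passes the arguments in the wrong order.
    # Nothing flags the mistake until this line actually runs.
    return format_price(cart_total, "total")


unit_test_format_price()

try:
    checkout_summary(9.5)
except AttributeError as exc:
    # The float has no .upper(), so the mismatch only surfaces at runtime.
    print("integration failure:", exc)
```

The unit test is green, yet the first real caller blows up, which is the gap that pushes people toward integration and end-to-end tests.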

I think this lack of strong interfaces has trained Ruby developers not to
trust unit tests and so they go down the path of integration and end to end
testing. Due to the weaker type and interface guarantees of the language, you
will write more tests to get the same level of confidence in your code. By
compensating for this language weakness, you are falling into "The Cucumber
Test Trap" or more generically you aren't following the testing pyramid
sensibly.

------
yeukhon
I don't program in Ruby. I write Python so Lettuce
([http://lettuce.it/](http://lettuce.it/)) is the obvious Python binding to
use. I've only used Lettuce once and I had an awful experience. That was two
years ago.

I like writing tests though my experience with software development has been
limited to school projects, personal projects and intern projects. Nothing
major. Nothing at the scale of ecommerce or Google scale.

But one thing I learned from writing tests (unit tests, integration tests and
functional tests) is to do what is straightforward and not worry about the
strict TDD cycle. I worry about logging and error messages.

I spent a whole week learning Lettuce. At that time the research project I
was working on lacked tests (the project was using Django). I was told to look
up Lettuce because writing Lettuce tests is like writing stories.

My awful experience was trying to wrap my head around the framework, trying to
do many things that would take many steps and many tricks to complete. The
table was awful. I didn't like it.

I gave up. I turned back to the built-in unittest module. I wrote some tests
and from that point the only testing library I have had to add is the _mock_
library. When I need an isolated unit test I reach for _mock_. When I need
to test integration between modules without involving database calls or HTTP
requests I use _mock_ as well.
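
A rough sketch of what that looks like, assuming a current Python where _mock_ ships in the standard library as `unittest.mock` (at the time it was a separate package). `fetch_username` and the example.invalid URL are hypothetical names invented for this illustration.

```python
from unittest import mock


def fetch_username(http_get, user_id):
    # http_get is any callable that takes a URL and returns a dict,
    # so a test can pass a mock instead of making a real HTTP request.
    response = http_get(f"https://example.invalid/users/{user_id}")
    return response["name"].strip().title()


# The mock stands in for the HTTP call and returns canned data.
fake_get = mock.Mock(return_value={"name": "  ada lovelace "})

assert fetch_username(fake_get, 7) == "Ada Lovelace"

# The mock also records how it was called, so the URL can be checked.
fake_get.assert_called_once_with("https://example.invalid/users/7")
```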

Testing should be straightforward and should remain simple. Maybe Cucumber
works nicely with Ruby's RSpec (I heard a lot about rspec and how rspec is so
awesome and helpful for testing, not sure if that's true since I am a Python
developer).

I also don't think writing a million test cases will work. In the end, while I
think unit tests and integration tests are useful and awesome, I like to catch
up with my functional tests. Most of my tests are done on the server side (so
I just use the _requests_ library to make HTTP calls). Depending on what kind
of application you are testing, the testing tools can vary.

In the end, keep your tests simple. If something breaks, you'd better fix it
quickly.

Logging, incident collection and error messages are the most important things
in deployment and testing. If I have to spend hours tracing down a bug (the
process died halfway) or a user is confused about why the service is not
working properly, I waste cycles I could have spent building my product.

~~~
shanemhansen
I've had pretty bad experiences with mocking frameworks. I try to keep all my
tests straight-up Python unittests. Forcing yourself to set up and tear down
classes and their dependencies in a few lines of code while also keeping the
ability to swap out dependencies improves my design by an order of magnitude.
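
Roughly what I mean, as a sketch only: a hand-written in-memory stand-in passed in through the constructor, with no mocking framework involved. `InMemoryUserStore` and `Greeter` are hypothetical classes invented for this example.

```python
import unittest


class InMemoryUserStore:
    """Hand-written stand-in for a database-backed store."""

    def __init__(self):
        self._users = {}

    def save(self, user_id, name):
        self._users[user_id] = name

    def load(self, user_id):
        return self._users[user_id]


class Greeter:
    """Takes its store as a constructor argument so it can be swapped out."""

    def __init__(self, store):
        self._store = store

    def greet(self, user_id):
        return f"Hello, {self._store.load(user_id)}!"


class GreeterTest(unittest.TestCase):
    def setUp(self):
        # Setting up the dependency takes a few lines, not a fixture framework.
        self.store = InMemoryUserStore()
        self.store.save(1, "Ada")
        self.greeter = Greeter(self.store)

    def test_greets_known_user(self):
        self.assertEqual(self.greeter.greet(1), "Hello, Ada!")


# Run the test case explicitly and keep the result object around.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(GreeterTest)
result = unittest.TextTestRunner().run(suite)
```

If wiring up the test takes more than a handful of lines like this, that is usually a hint the class has too many dependencies.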

I've had projects that were incredibly hard to change due to mocked out
database and network dependencies.

That being said, I've occasionally resorted to mock when I'm stuck with crappy
code that needs to be put under test before refactoring.

~~~
yeukhon
I am okay with mock. I was an early adopter of mock so I had the same
struggle. Well, I agree there is a big learning curve using mock. When I mock
I get lazy and mock a lot of things out.

If the project has to spawn processes or do a lot of crazy network stuff I just
forget about mocking and true unit tests. I go straight to functional tests.

------
slavik81
I've never used Cucumber, but a lot of this article seems very familiar. I was
working in a C++ desktop app, and we had similar problems with our tests. Some
developers insisted on thoroughness and would write massive sets of end-to-end
tests (in addition to unit tests). Our test suite was rapidly becoming slow
and unwieldy. After about 8 months of development, we had around 1000 end-to-
end tests, which took an hour to run. That was already painful, but we feared
what it would be in a couple years if we kept adding tests at that rate.

My solution was to abstract away the things that made it slow and swap in
faster components for the tests. Network connections were replaced with in-
memory buffers, external programs were pulled into the same executable, and
timers were simulated. Bam. The tests compiled in minutes and ran in seconds.
People actually ran and listened to the tests because they were fast and
reliable.
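
To give a flavour of the idea, here is a small sketch in Python (used purely for brevity; the real project was C++ and looked nothing like this). `FakeClock` and `Heartbeat` are hypothetical names, and the 30-second interval is an arbitrary assumption.

```python
import io


class FakeClock:
    """Simulated timer: time advances only when the test says so."""

    def __init__(self):
        self.now = 0.0

    def time(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


class Heartbeat:
    """Writes a ping to a stream at most once per interval."""

    def __init__(self, stream, clock, interval=30.0):
        self._stream = stream
        self._clock = clock
        self._interval = interval
        self._last_sent = None

    def tick(self):
        now = self._clock.time()
        if self._last_sent is None or now - self._last_sent >= self._interval:
            self._stream.write(b"ping\n")
            self._last_sent = now


wire = io.BytesIO()  # in-memory buffer standing in for a network connection
clock = FakeClock()
heartbeat = Heartbeat(wire, clock)

heartbeat.tick()     # first tick always sends
clock.advance(10)
heartbeat.tick()     # only 10 simulated seconds elapsed, nothing sent
clock.advance(20)
heartbeat.tick()     # 30 simulated seconds elapsed, sends again

assert wire.getvalue() == b"ping\nping\n"
```

Thirty seconds of behaviour is exercised without sleeping or opening a socket, which is where the speed-up comes from.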

Of course, I was then left with no real end-to-end automated tests. The
'almost end-to-end' tests I had covered most of the software, though, and
manual testing was used to catch the rare bug that would slip through. My
conclusion was similar to that of the article: if you were going to do real
end-to-end testing, you couldn't write very many tests.

------
jdmitch
_Avoiding the cucumber test trap is hard. It’s easy to keep adding scenarios
which give you a false confidence that your application is working correctly.
It’s easy to just add some more code to make those scenarios pass._

_Instead we need to keep focusing on the conversations. Find the scenarios
that matter, that are important to document, that are worth automating and
push everything else down into lower level, isolated tests._

How then do you prioritise which tests are actually important to document, or
better still edit out the ones that aren't? It seems like it would be
difficult to be consistent.

~~~
peteretep
Agree and write the Cucumber tests when you write your stories, in the same
meeting. Agree to write the minimum test that describes the feature in a way
that everyone agrees on and understands. Don't rely on Cucumber to test all
your edge-cases.

~~~
tooky
This.

BDD is about the conversations - getting people together with a business
mindset, a dev mindset and a tester mindset to discuss what needs to be built.

From that discussion you get a good idea of the business requirements - and
hopefully some examples of how they should play out.

The person with the business mindset will consider some of those examples
important enough to automate as business-facing acceptance tests. Those are
the ones you write in Gherkin.

The person with the testing mindset will have a bunch of ideas about how you
can break this. Some of those will be important enough that they should become
business facing tests, but often the group can decide that these would be
better tested by unit tests.

We call this meeting the 3 amigos.

------
jdlshore
I was heavily involved with the Fit project for a while. (Fit is Ward
Cunningham's predecessor to Cucumber. It used HTML tables rather than ASCII
sentences.)

I find it interesting, if not surprising, that the Cucumber community is
discovering exactly the same issues that we did with Fit: namely, that it
encourages brittle integration tests, and that people don't use it for its
intended purpose of collaboration.

I've come to believe that these problems are unsolvable.

I worked on and promoted Fit for several years. Eventually, after some deep
soul searching, I concluded that Fit (and by extension, Cucumber) is _solving
the wrong problem_. The value of Fit (and Cucumber) comes from _discussions
with domain experts_. The tools encourage a certain amount of rigor, which is
good, but you can get that rigor just as easily by discussing concrete
examples at a whiteboard.

The actual automation of those examples, which is where Fit and Cucumber focus
your attention, has minimal value. In some cases, the tools have negative
value. Their tests are fundamentally more complex and less refactorable than
ordinary xUnit tests. If you want to automate the examples, you're better off
doing it in a programmer tool.

Some people got a lot of value out of Fit, and I'm sure the same is true for
Cucumber. They got that value by using it for collaboration and focusing on
domain rules rather than end-to-end scenarios. My experience, though, was that
the vast majority used it poorly. When a tool is used so badly, so widely, you
have to start questioning whether the problem is in the tool itself.

Ward and I ended up retiring Fit [1]. I've written about my reasons for
abandoning it before [2] [3].

[1] [http://www.hanselminutes.com/151/fit-is-dead-long-live-fitnesse-with-ward-cunningham-and-james-shore](http://www.hanselminutes.com/151/fit-is-dead-long-live-fitnesse-with-ward-cunningham-and-james-shore)

[2] [http://www.jamesshore.com/Blog/The-Problems-With-Acceptance-Testing.html](http://www.jamesshore.com/Blog/The-Problems-With-Acceptance-Testing.html)

[3] [http://www.jamesshore.com/Blog/Acceptance-Testing-Revisited.html](http://www.jamesshore.com/Blog/Acceptance-Testing-Revisited.html)

~~~
aslakhellesoy
I completely agree with you that the most important value with BDD and
Specification by Example happens during conversations at a whiteboard.

Cucumber/Fit/high level tests aren't fundamentally less refactorable. They
have a looser coupling to code than unit tests, so when the production code is
refactored they are less affected than xUnit tests.

The mistake most people make is that they are trying to cover too much with
Cucumber. And they use it _solely_ as an automation tool without any
collaboration whatsoever. That doesn't work, but it's difficult to get that
message across to people.

In order for Cucumber and Specification by Example to be used well it is
essential to provide tooling that appeals to the most critical piece of the
collaboration puzzle: The business analysts (or the people with the
requirements).

Nobody has solved this problem yet, but I think it is a solvable problem.
We're trying to solve it with Cucumber Pro:
[https://cucumber.pro](https://cucumber.pro)

Let's see how that goes.

~~~
jdlshore
Looks promising! I wish you the best of luck with it.

------
cryptos
The author of the article should read the book "Specification by Example"
([http://specificationbyexample.com/](http://specificationbyexample.com/)).

It is absurd to complain about Cucumber if Cucumber is used for things it
wasn't intended to do.

