Why Bother With Cucumber Testing? (jackkinsella.ie)
80 points by gcv on April 7, 2014 | 29 comments

Creator of Cucumber here.

Cucumber is not a testing tool [1], it is a collaboration and analysis tool.

Use it to document (in very broad strokes) the features of your system.

It was never meant to be used as an exhaustive regression testing tool, or to replace unit testing.

[1] https://cucumber.pro/blog/2014/03/03/the-worlds-most-misunde...

I'm curious what you think about 'Relish', software for turning cucumber steps into developer documentation. Ruby projects vcr[1] and rspec[2] use it as their exclusive/preferred/canonical developer API documentation.

As a developer-user (not maintainer/creator) of these projects, I've always found their relish-produced api docs to be fairly insufficient, confusing, and frustrating for me as a developer needing API docs.

[1] https://www.relishapp.com/vcr/vcr/docs

[2] https://relishapp.com/rspec/docs

My comment was a bit too late to be seen last time [0], so I'll repeat it here:


I was heavily involved with the Fit project for a while. (Fit is Ward Cunningham's predecessor to Cucumber. It used HTML tables rather than ASCII sentences.)

I find it interesting, if not surprising, that the Cucumber community is discovering exactly the same issues that we did with Fit: namely, that it encourages brittle integration tests, and that people don't use it for its intended purpose of collaboration.

I've come to believe that these problems are unsolvable.

I worked on and promoted Fit for several years. Eventually, after some deep soul searching, I concluded that Fit (and by extension, Cucumber) is solving the wrong problem. The value of Fit (and Cucumber) comes from discussions with domain experts. The tools encourage a certain amount of rigor, which is good, but you can get that rigor just as easily by discussing concrete examples at a whiteboard.

The actual automation of those examples, which is where Fit and Cucumber focus your attention, has minimal value. In some cases, the tools have negative value. Their tests are fundamentally more complex and less refactorable than ordinary xUnit tests. If you want to automate the examples, you're better off doing it in a programmer tool.

Some people got a lot of value out of Fit, and I'm sure the same is true for Cucumber. They got that value by using it for collaboration and focusing on domain rules rather than end-to-end scenarios. My experience, though, was that the vast majority used it poorly. When a tool is used so badly, so widely, you have to start questioning whether the problem is in the tool itself.

Ward and I ended up retiring Fit [1]. I've written about my reasons for abandoning it before [2] [3].

[0] https://news.ycombinator.com/item?id=7514651

[1] http://www.hanselminutes.com/151/fit-is-dead-long-live-fitne...

[2] http://www.jamesshore.com/Blog/The-Problems-With-Acceptance-...

[3] http://www.jamesshore.com/Blog/Acceptance-Testing-Revisited....

> I've come to believe that these problems are unsolvable.

"Unsolvable" is strong. My take is that these problems stem inevitably from approaching the goal ass-backwards.

You're trying to force a particular testing framework on people who don't care about frameworks, supposedly in the name of "better communication". There's no reason to expect that to work.

The way I've advocated doing it, for years now, is to first sit with the people in question, discover how they communicate about business goals that the programming effort is expected to assist, formalize their notation as little as you can get away with and use that for acceptance testing.

This should be a process of active listening, not passive recording. The client should be gently nudged away from speaking in solution-terms, for instance.

> My take is that these problems stem inevitably from approaching the goal ass-backwards.

Yes, and that's the problem that's unsolvable. (Or very, very difficult.) I'm not saying people can't figure out how to use tools like Fit or Cucumber well; I'm saying that, for a lot of complicated human reasons, they don't. And I don't think there's anything a tool can do to change that.

The reason is that using Fit (or Cucumber) well requires that you change the way that you interact with your business-oriented colleagues. That's hard. It's not a technical problem; it's a social problem, and it needs face-to-face politicking. My experience is that teams that turn to tools first are even less likely to be able to solve these problems than those that don't. Many of them don't even understand that the problem exists.

In order for a tool to solve this problem, it would have to force—or strongly guide—the "proper" workflow and be highly desirable for everyone involved. Aslak & company are working on https://cucumber.pro/, which is a valiant attempt; it will be interesting to see if it works out.

I agree: "unsolvable" is not what I believe. I also agree that what we have to learn is that this is a frequent problem with the organizational systems we create. I think that by thinking about the problems we can create systems that work, but it requires real focus: attempts that don't concentrate on the areas likely to create problems are likely to fail.

I think it is due to some systemic factors in most of our organizations, so it often requires more than one simple change. A huge key is what you mention: the communication between "business owners" and the development team. You also need to stop shouting down, as negative, the people who say the new attempt won't work. Given the results around the globe, there is lots of evidence it won't. We need to delve into why those who criticize the new hope think it won't work, and try to adopt improvements that make it more likely to work.

I should mention that Aslak Hellesoy (creator of Cucumber) responded with a well thought-out comment about the issues I raised above: https://news.ycombinator.com/item?id=7519369

> I find it interesting, if not surprising, that the Cucumber community is discovering exactly the same issues that we did with Fit: namely, that it encourages brittle integration tests, and that people don't use it for its intended purpose of collaboration.

> I've come to believe that these problems are unsolvable.

I think they're quite solvable; it's just that they aren't technical problems, they are social problems. Technical solutions fail, and the people who are good at solving technical problems often aren't particularly skilled at solving social problems.

I think Cucumber could be a good tool in the right workflow, but establishing the right cross-functional workflow is a very hard social problem that the people that understand the technology very often don't have the skill to solve (and, generally, don't have the social position to effectively champion solutions even if they had them.)

I don't think it's unsolvable either.

I'm often asked to come in and help teams who want to use Cucumber, because they want to collaborate better. We train them in collaboration techniques like the 3-amigos as well as reminding (brainwashing?) them about the importance of conversations and ubiquitous language. We teach them how to use Cucumber for automation too, but these visits give us a chance to do much more than that.

The only time I have found cucumber testing useful is when I was working in a "corporate" scenario where we had tedious UAT tests written for humans to run manually.

I convinced them to use plain-text Cucumber syntax files. We had to use detailed imperative, not declarative, tests with the default steps, which all Cucumber gurus hate and which were in fact relegated to a "cucumber-training-wheels" repo. But in this scenario they were the only option (old-school enterprise test thinking). Then I just automated them myself and made them part of the CI build.

The result: a 100% pass on the first UAT run (minus a few percent from UAT testers reading tests differently from how everyone else had been reading them up to that point), everyone happy, we looked good, and the client-hired contract testers finished early, saving money.

Without Cucumber we would have been in Excel spreadsheet "UAT Test file" hell.

I am not sure I would use them in any other scenario though.

I used cucumber in the same type of scenario (humans running tests) and it worked great. It works so well it's really tempting to try and use it everywhere, but there are definitely shortcomings when cucumber gets misapplied.

I've never quite understood exactly whom Cucumber is for. Clients don't want to read Cucumber specs and programmers don't want to write them.

> Clients don't want to read Cucumber specs and programmers don't want to write them.

Perhaps it is a general problem? Clients want to stay on their end of the (in-)formality spectrum (narrative) and programmers on theirs (formal spec). Nobody wants to go to the middle.

Yeah, there are analysts. They bridge the gap but are themselves an additional link where transcription errors happen.

Another problem is that sometimes work requires people to do things they don't like. Sure, seek solutions that make everyone happy with what they are tasked to do, because that is nice and also more likely to work. But if part of someone's responsibility requires doing some stuff they don't like, make sure they do it (give them the opportunity to adjust things so the needs are met in a way they like better, but simply not doing the stuff they don't like is not acceptable).

I understand that in this case people can say they don't like it and that it isn't something they should have to do. Well, that is another issue: why have people do stuff (whether they like it or not) that isn't providing value to the organization?

>Clients don't want to read Cucumber specs

Far too broad a generalization I think. I currently work with a non-technical CEO whose first encounter with Cucumber was when we were trying to re-specify a feature that we'd got wrong at the previous attempt. It was a revelation to him, and his initial burst of enthusiasm resulted in his going away and writing - unknown to us - Cucumber Features for about half of the functionality of the app. He's calmed down since then and leaves the actual writing to us, but Cucumber remains our go-to tool for resolving complicated requirements.

Obviously, trying to force Cucumber on an unwilling/uninterested client is going to be difficult and damaging to the relationship. Maybe if their first experience was a positive one, in which using Cucumber helped resolve a communications problem, rather than a technical ritual that they were forced to repeat for every user story, then attitudes might change?

Most points raised by the author are criticisms of specific Cucumber step implementations. I understand that this is not criticism of Cucumber-the-tool but of Cucumber-the-process, but of course, if you're using your tools wrong, nobody's going to just fix your process.

The awkward step naming, and the routing issue, are all with the default steps of an integration of something like Webrat or Capybara. In recent projects (and I think this is the default for newer versions of Capybara), no steps are automatically generated for you. You have to choose the level of detail you want to operate at yourself. Comparing character count of a method call and a Gherkin step is also a rather useless metric.

The "doesn't share code with my test env" issue is a trivial fix: move your test-but-also-cucumber-env code from test_helper.rb into another file, then require it from both cucumber's env.rb and test_helper.rb.
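A rough sketch of that fix, assuming a hypothetical shared file at test/support/shared_env.rb (the paths and the load counter are made up for illustration). To keep it runnable as a single script, it writes the three files into a temp directory and then loads both entry points; in a real project each suite would load its own helper in its own process.

```ruby
require "fileutils"
require "tmpdir"

# Illustrative file layout; only the require structure matters.
FILES = {
  "test/support/shared_env.rb" => <<~RUBY,
    # Setup needed by both suites would live here (fixtures, fakes, ...).
    $shared_env_loads = ($shared_env_loads || 0) + 1
  RUBY
  "test/test_helper.rb" => <<~RUBY,
    require_relative "support/shared_env"
  RUBY
  "features/support/env.rb" => <<~RUBY
    require_relative "../../test/support/shared_env"
  RUBY
}.freeze

Dir.mktmpdir do |root|
  FILES.each do |relative_path, body|
    path = File.join(root, relative_path)
    FileUtils.mkdir_p(File.dirname(path))
    File.write(path, body)
  end

  # Stand-ins for starting the unit suite and the Cucumber suite.
  require File.join(root, "test/test_helper.rb")
  require File.join(root, "features/support/env.rb")
end

puts "shared setup ran #{$shared_env_loads} time(s)"
```

Both helpers pull in the same file, so the shared setup lives in exactly one place, and require only loads it once per process.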

Personally, I write Cucumber features exactly because they mean I don't have to think about routing, or paths, or syntax. I try to put myself in the role of the user and write down what I want to accomplish, and how. A key indicator for reasonably abstracted feature files might be the equivalence "I changed something in a .feature file" IFF "I have to communicate a change to my users."

But if you want to shoot yourself in the foot, you really have to do it yourself.

In Python, I use doctests for things like this. Just write the features in plain understandable language, with clear code that serves both as integration/behaviour test and example in between.

Cucumber etc. is usually about coaxing natural language into code, which makes no sense to me. Better to use natural language with useful code examples, which are readable to client, developer, and test software.

I agree with most of the problems OP describes, but find the article lacking two important things.

It is always easy to say "don't do X". But without explaining how X can be replaced with, or ported to, an alternative, it is rather hollow advice.

For me, Cucumber is not just the Gherkin syntax. It is, first and foremost, a turnkey setup that allows me to have browser interaction (including Selenium), organised in a nice and workable way.

How I would implement something like:

  When "I fill in my payment details" do
    # Capybara's fill_in takes the value via the with: keyword
    fill_in "Creditcard Number", with: "1111111222"
    fill_in "ccv", with: "1234"
  end

in, say, Minitest or Minitest::Spec is beyond me.

Are there good resources on doing proper integration tests as OP suggests? There are entire books on Cucumber, yet I cannot find something similar on Minitest.

Is it easy to set up a toolchain that allows me to use Capybara and Selenium (PhantomJS) interchangeably in my integration tests? If so, is there a place where I can find documentation on that?

Here's an entirely working example I just created: https://gist.github.com/benolee/ca49aaa9a0363c18904b

Capybara does pretty much what you are asking for, and it has various drivers, e.g. for Selenium.
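A minimal sketch of how the payment step from the question could look as a plain helper method in a Minitest test. To keep it runnable without a browser, `FakeSession` is a hypothetical stand-in for a Capybara session; it only mimics the call shape of Capybara's `fill_in(locator, with: value)`. `PaymentSteps` and `CheckoutTest` are made-up names too.

```ruby
require "minitest/autorun"

# Hypothetical stand-in for a Capybara session (assumption, not Capybara itself).
# It records what was typed into which field.
class FakeSession
  attr_reader :fields

  def initialize
    @fields = {}
  end

  def fill_in(locator, with:)
    @fields[locator] = with
  end
end

# The Cucumber step becomes an ordinary, reusable Ruby method.
module PaymentSteps
  def fill_in_payment_details(session, number: "1111111222", ccv: "1234")
    session.fill_in "Creditcard Number", with: number
    session.fill_in "ccv", with: ccv
  end
end

class CheckoutTest < Minitest::Test
  include PaymentSteps

  def test_payment_details_are_filled_in
    session = FakeSession.new
    fill_in_payment_details(session)

    assert_equal "1111111222", session.fields["Creditcard Number"]
    assert_equal "1234", session.fields["ccv"]
  end
end
```

In a real suite the session would come from Capybara rather than a fake, and the helper module would be shared across test classes the same way step definitions are shared across features.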

So it seems like the new hot way to do integration testing is with RSpec + Features [1]. Any sources on doing this well? I find myself writing methods that read like Cucumber.

[1] http://pivotallabs.com/getting-by-with-rspec-feature-specs/

In my frontend web team, we used Cucumber for over a year. We slowly came to the same conclusion - Cucumber tests are hard to maintain, run more slowly, and overall take more effort to develop.

As of today, we're in the process of ripping out all of our Cucumber tests and replacing them all with RSpec features.

In our largish company we found that product owners generally didn't care about reading Cucumber features. The definition of the products and how they work is captured in the agile boards and cards we write together, along with documentation and wireframes which live on our intranet.

I've done integration testing just about every way I can think of. I've "cuked it wrong". I've used plain rspec. I've used rspec with features. I've written an equivalent framework for use in a test::unit shop.

They're all fine. They get the job done.

I've also done acceptance testing with all of the above, and it works great as well. It's a language thing. My AT's are rarely more than half a dozen lines, and many of them are less than 5.

Whether or not you should do ATs at all really comes down to the process culture. If you have someone signing off on whether a feature works, I think it's great to do. If you're the only one signing off, I think it's worth doing as a tool to help implement the feature, but then convert it to an integration test, trim the AT to be part of a very limited suite, or toss it altogether. They're not unit tests, and not meant to be voluminous documentation of your app. That doesn't mean don't do them. It means learn how to do them well.

I do this with RSpec + PageObject: https://github.com/cheezy/page-object

An example that's similar to what I do: http://youtu.be/e9tfC-gLW8c?t=8m28s

I wouldn't say I'm doing it well, but I'm doing it. I resolved to make myself do TDD for a few projects and today signed up for a month of Thoughtbot's Learn program. They're big advocates of integration spec testing, and have written quite a bit about it.

I'm not wholly convinced it's the best way to go, but it does offer some advantages.

I use both RSpec and Cucumber often, and in terms of ease, I think Cucumber is just easier to write.

RSpec verbs and scoping still mess with my head, and I always have to debug why the tests are breaking.

Cucumber, especially for move-fast-and-break-things startups, can be extra baggage. It's only needed where there is so much behavior and/or so many stakeholders to please that it would matter. Otherwise, it's probably as useful as repetitive, boilerplate comments that bloat LoC without adding value.

By contrast, if you want to go down the waterfall route, especially for safety / control systems, formal specs can be really, really useful (Z "Zed" for example). It looks like model descriptions for MVC, but it's closer to being readable formal proofs... so it's very useful for engineering / scientific projects. [0]

IOW... What I remind myself: "Stop playing with tech toys, ship and get some customer feedback already."

[0] https://en.wikipedia.org/wiki/Z_notation

This mirrors my own experiences using Gherkin syntax in other tech stacks. The most workable scenario I've come up with is to focus on simple, terse syntax that is still readable.

Build reusable page objects, keep your actual tests short and things get nice and easy.
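A small sketch of what a reusable page object can look like, in plain Ruby. `FakeDriver` is a hypothetical stand-in for a browser driver so the example runs on its own; `LoginPage`, the field labels, and the path are made up for illustration.

```ruby
# Hypothetical stand-in for a browser driver (assumption); it only logs calls.
class FakeDriver
  attr_reader :log

  def initialize
    @log = []
  end

  def visit(path)
    @log << [:visit, path]
  end

  def fill_in(field, with:)
    @log << [:fill_in, field, with]
  end

  def click_button(label)
    @log << [:click_button, label]
  end
end

# The page object hides locators and paths behind intention-revealing methods.
class LoginPage
  PATH = "/login"

  def initialize(driver)
    @driver = driver
  end

  def visit
    @driver.visit(PATH)
    self # return self so calls can be chained
  end

  def log_in_as(email, password)
    @driver.fill_in "Email", with: email
    @driver.fill_in "Password", with: password
    @driver.click_button "Log in"
    self
  end
end

# The test itself stays short and readable.
driver = FakeDriver.new
LoginPage.new(driver).visit.log_in_as("user@example.com", "secret")
```

If a label or path changes, only the page object needs updating; the tests that call `log_in_as` stay as they are.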

Lots of good options in this space. A few years ago I built FluentAutomation [1] to solve this issue in the .NET community.

[1] http://fluent.stirno.com/
