That said, almost every rails project I've worked on has had the cucumber gem included at some point, just not actively used.
IMHO the vocal Rails community puts too much emphasis on testing. I write tests for my code, but not as many as some would like. The RSpec book in particular is a gigantic load of crap. And I say this as a fan of comprehensive tests! The ridiculous lengths that book goes through in testing that codebreaker game are so detached from reality it makes my head hurt.
Cucumber tries to solve the problem of turning customer requirements into 'real code'. In exchange for that worthwhile benefit, it asks you to implement the most terrible, reg-ex based spaghetti code imaginable.
The problem is that it doesn't solve the original problem AT ALL. And then you are left with terrible reg-ex driven spaghetti code. Like the Jamie Zawinski saying, "now you have two problems".
The lesson here is that software development processes have to pass the 'human nature' test.
The software industry has largely abandoned waterfall development because it just doesn't work well in practice. It doesn't work because people don't know perfectly what they want before they build it. Agile processes usually are much more efficient because they are more closely aligned to how humans solve problems in the real world.
Cucumber suffers from the same issue of being disconnected from reality. In theory, you can find a customer who can give you perfectly written use cases and you can cut-and-paste those into your cukes. In practice, that never, ever works. So let's all stop wasting our time pretending it was a good idea now that it has been shown to not work.
No, it tries to solve the problem of narrowing the gap between objective, easily reusable tests and customer-understandable requirements. Insofar as it involves code, "real" or otherwise, that's a means to solving the problem, not the fundamental problem it's trying to solve.
> In theory, you can find a customer who can give you perfectly written use cases and you can cut-and-paste those into your cukes.
The theory that "customers" on their own do this is flawed (and seems to be part of the we-don't-need-no-stinking-analysts school of software development theory); that's not a problem with Cucumber or tools, it's a problem with not having system/business analysts (which term is preferred depends on the environment, but they are the same thing) who work with customers to elicit requirements that the customers own and can validate but which the analysts help them shape into the needed structure.
This does take time to create, but in my experience having acceptance tests written in a form which is readable to anyone is very useful. I've even gone so far as to create a gem to display the features as the 'help' section of a website:
What's wrong with `rspec -f doc`?
I don't know Cucumber at all but I find it fascinating how this software seems to join a long list of hopeful but discarded "human readable coding" systems - both Cobol and SQL were touted back in the day as ways non-programmers could write code, and the consensus seems to be that the human-readable part just made things more difficult on balance.
Perhaps, but I think more importantly it's just a time sink. You end up writing nearly all of the same test code, only reusing bits becomes more difficult thanks to the abstraction layer.
I've only suggested Cucumber in extreme cases, and even then with caution, and not before mentioning Spinach as an alternative. https://github.com/codegram/spinach
Spinach operates with plain old Ruby objects. It fixes the two weaknesses of Cucumber: step maintainability and step reusability.
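To make "steps on plain old Ruby objects" concrete, here is a minimal sketch of the idea. It does not load Spinach: `FeatureSteps`, `step`, `run`, and `ArraySteps` are hypothetical stand-ins written for this example, and Spinach's real API may differ.

```ruby
# Hypothetical stand-in for a per-feature steps class; NOT Spinach's real API.
class FeatureSteps
  # Each subclass keeps its own step table, so steps never leak between features.
  def self.steps
    @steps ||= {}
  end

  # Register a step under its exact text (no regular expressions involved).
  def self.step(text, &block)
    steps[text] = block
  end

  # Run the named steps in order on a fresh instance; state lives in plain ivars.
  def self.run(*texts)
    instance = new
    texts.each do |text|
      block = steps.fetch(text) { raise KeyError, "undefined step: #{text}" }
      instance.instance_exec(&block)
    end
    instance
  end
end

# One class per feature: its steps are isolated from every other feature.
class ArraySteps < FeatureSteps
  attr_reader :array

  step 'I have an empty array' do
    @array = []
  end

  step 'I append my name to it' do
    @array << 'fred'
  end
end

result = ArraySteps.run('I have an empty array', 'I append my name to it')
puts result.array.inspect
```

Because the step table belongs to the feature's own class, renaming or deleting a step can only affect that one feature, which is the maintainability argument in a nutshell.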
That said, I seldom use either, because I can easily write the same thing under one roof with RSpec.
Over testing can be as bad as not testing at all.
The primary heuristic I look for is "Do I have enough tests that I can be reasonably sure these tests will fail if I break the code". That is the fundamental question! This is different from the RSpec book's approach, which is more dogmatic and highly structured (you need to test every function, every entry point, pretty much every single thing). The thing is, the RSpec book approach does work! It works better in fact! It also is a colossal waste of time. I'd write tests like that if I was writing a trading platform for the NYSE, but almost no one does that sort of work.
Since I'm in the midst of writing a book (shameless plug http://exploringelasticsearch.com ) I don't have time to write that post unfortunately.
The point of BDD isn't to "stop breaking the code". I think "coders" value "code" too much.
The point of BDD is to build understanding of the problem you're solving by enumerating behavioral ambiguities/edge cases up front, before you've invested time and emotion into a particular code approach.
You're using testable _behavior_ to drive the development of your code. More simply - test first, code second. Having a regression suite is a cool side effect.
Of course, this is just how it's "supposed to work." In real life YMMV. Whether or not you think there's much value in that methodology is another story - but if you are going to dismiss something, you have to dismiss it based on a relevant metric.
See http://ravimohan.blogspot.com/2007/04/learning-from-sudoku-s... for an interesting example.
Basically, test the heck out of your domain model layer, and then do full stack acceptance tests above that to simulate the user using the app. If it's a web app use something like Capybara; if it's a terminal app or some other interface, simulate that appropriately with some other kind of integration test.
...who thought this was a good idea? (Sorry, being harsh.)
That said, I think the idea is conceptually fine, and, as usual, I think could actually be fairly pleasant with the right (static) tooling applied, e.g. ThoughtWorks's Twist:
Which is a GUI that provides all the niceties of "what commands are allowed", "what params do they take", "oh, refactor this command name from abc to xyz", etc.
I think PM/testing types would love this sort of setup. Granted, it would still take dev investment to get the fixtures/commands/etc. set up. But if you have a huge line-of-business app that will drive a business for the next 5-10 years, I think that's a good investment.
Disclaimer: I've never actually used Twist because it's a commercial product. Yes, I know I suck.
Having said that, I did work a gig where the tester would sit with the subject matter experts, and as they would talk, he would capture what they were saying as Cucumber tests, and he would then echo the tests back to them to see if they were right. Afterwards, he would then add whatever other tests he needed (checking weird edge cases and such), and code up any additional fixtures he needed. Then it was just a matter of me getting all the tests to pass. It was a really nice way to work - I knew unambiguously when I was done with something. (It helped that the tester sat right next to me.)
Cucumber is like any tool; if used correctly, it's great, and if it's used as a de facto tool, then you're going to struggle with it.
Given a person "fred" exists
And a person "ethel" exists
And a fatherhood exists with parent: person "fred", child: person "ethel"
Given Ethel has a father called Fred
Borne o'er the cruel firmament become Ethel
That Fred hath usurp'st from her the silence
When lo! Upon them should harrow a fatherhood
Of parent, Fred besieged by the burden
And his daughter the indifferent imposition of child
The problem isn't with Cucumber. The problem is your cukes suck. This is probably also a problem in your RSpec/Minitest/whatever tests. If you're using Cucumber the way 90% of the Cucumber tests I've ever seen (indeed, a great many I've written myself) are written, then you're writing integration tests, and probably crappy ones at that.
Given a user
And I go to some really cool page
And I click on some button
Write actual domain tests for your acceptance tests. Do this regardless of your framework; there's absolutely nothing preventing you from doing it in RSpec or anything else. Fuck step reuse (gherkin alternatives like Spinach are great, specifically because the steps are isolated to the test).
If you don't want to write acceptance tests, and you feel comfortable with that, no biggie. Lots of people don't. But don't create a straw man out of Cucumber just because you don't understand layers of testing.
This goes for Cucumber fanboys as well. Don't push Cucumber into layers where it doesn't belong. It's a domain testing tool. Not an integration test. Definitely not a unit test. Don't push it on a team that doesn't want it. You can write the same kinds of tests in RSpec. Disclaimer: I like Cucumber. I am not using it in my current job because it wasn't a good fit for the company. I will probably use it again one day. I still write ATs.
Given I hate cucumber
And I post a screed against it
Then the haters will rejoice
describe "hating on cucumber" do
it "produces a response" do
The argument isn't so much that Cucumber doesn't suck and people are just using it wrong; it's that Cucumber does suck for the things people complain about it sucking for, but it isn't designed or promoted for those uses.
It's "yes, hammers suck for connecting things with machine screws, but that's not what they're for", not "hammers don't really suck for connecting things with machine screws, it's just that most people are holding the hammer wrong when they try to do that".
It's not my hammer (or that of most of the people, I would imagine, posting in a similar vein), and I'm not interested in fault.
> Someone put that idea there
Yes, people have put out bad ideas of what Cucumber is for -- and the people complaining about it not being good for things it isn't designed for are among those people.
That's why some of us are countering those incorrect ideas, so they don't keep spreading.
> you as the creator of this not-a-screwdriver are responsible for the failure of that message.
Wrong ideas that only some subset of the people exposed to a product get about its use are not solely the responsibility of the creator of the product. Obviously, if you have an interest in selling the product, you are the person that stands to gain or lose based on those wrong ideas and you have a particular interest in correcting it and a responsibility to your business to take any efficient steps to combat that misimpression, but then again, if you are using tools in your business, you likewise have a particular interest in identifying and correcting your own mistaken ideas about tools and a responsibility to your business in taking any efficient steps to correct misimpressions you may have.
I've seen people try and fail to use Cucumber effectively.
Now, Cucumber is a really impressive implementation, but it's a fundamentally flawed idea. You cannot pivot your requirements from business talk "As a blah I want X" into functional tests without a lot of shear.
You may as well write your test descriptions in Farsi or Russian for a team that only speaks English. They make no sense.
Functional requirements should set the groundwork, but they shouldn't serve as the template for construction. You convert these requirements into a format and language understood by developers. If you can preserve some kind of mapping between the original requirement and the myriad of things that had to be implemented to make that feature work, you've done something amazing.
Sadly, Cucumber doesn't let you do that.
The damn example you posted at the end of an RSpec test that has a bunch of defined methods IS EXACTLY WHAT THE AUTHOR IS SAYING!
"Cucumber is just a way to wrap RSpec tests with a non-technical syntax. Any supposed benefits of it go to waste because code is only interesting to those who are working in it. Quit writing cukes unless you can honestly say that there is someone reading them who would not understand pure Ruby."
Who exactly are you arguing against?
Also, as contrived as his examples are, he still managed to illustrate exactly the problem inherent in most of the cukes I've ever seen: they're about clicking buttons and navigating a browser. That's not domain. (To be fair, the Cucumber team didn't do themselves any favors there by adding web_steps.rb to their installation process way back when; it's long gone now but the legacy lives on.)
It's far from the worst integration testing tool available, and I'm not really sure that there is a big difference in the requirements for an integration testing tool vs. a domain/acceptance tool; certainly, the ideal usage patterns into which the tools are embedded differ for those uses (including who should be writing tests: if you are trying to involve customers in test design, but you're trying to do integration-style tests, that is likely to fail hard independently of tooling).
There's an advantage to this as well, if you practice something that approximates DDD: you can use the same test to hit your domain library as your user interface. The step implementations are different, obviously, but the gherkin test is the same.
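A rough sketch of what "same test, different step implementations" can look like. Everything here is hypothetical (`Cart`, `DomainDriver`, `ScreenDriver`, and the fake rendered text stand in for a real domain library and a real UI); the point is only that the scenario is written once against a small driver interface.

```ruby
# Hypothetical domain object under test; prices are whole numbers for simplicity.
class Cart
  def initialize
    @items = []
  end

  def add(name, price)
    @items << [name, price]
  end

  def total
    @items.sum { |_, price| price }
  end
end

# Driver 1: talks to the domain library directly.
class DomainDriver
  def initialize
    @cart = Cart.new
  end

  def add_item(name, price)
    @cart.add(name, price)
  end

  def displayed_total
    @cart.total
  end
end

# Driver 2: goes through a fake "screen" that renders text, like a UI would.
class ScreenDriver
  def initialize
    @cart = Cart.new
  end

  def add_item(name, price)
    # Stands in for filling a form and clicking a button.
    @cart.add(name, price)
  end

  def displayed_total
    # Stands in for reading the total back off the rendered page.
    rendered = "Total: #{@cart.total}"
    Integer(rendered[/\d+/])
  end
end

# The scenario is written once, in terms of the driver interface only.
def two_items_scenario(driver)
  driver.add_item('tea', 3)
  driver.add_item('cake', 4)
  driver.displayed_total
end

[DomainDriver, ScreenDriver].each do |driver_class|
  puts "#{driver_class}: #{two_items_scenario(driver_class.new)}"
end
```

With Cucumber the switch between drivers would typically happen in the support code rather than in a loop like this, but the shape is the same: one scenario, two ways of carrying it out.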
Also, I definitely agree with those who have stated elsewhere that customer collaboration is a red herring. I've guided customer discussions with cukes before (e.g. talking about edge cases and aspects they haven't thought about), but never has customers writing cukes been a good experience.
I agree that there is a big difference (I'd describe it differently because I don't think of ITs as being about code, I think of UT as being about code, ITs as being about architecture, and ATs as being about domain concepts, but that's, arguably, quibbling); I just think it's a difference that often manifests more in terms of the organization of the who-does-what-and-how outside of the tool but which, between ITs and ATs, doesn't necessarily make a huge difference in desirable tool features. (Particularly, I think that non-code language for tests can have value for ITs, though the style of the language and its audience is different than for ATs.)
Cucumber helps techies (devs, testers, etc.) and non-techies (product owners, scrum masters etc.) work collaboratively. I don't think the authors of Cucumber ever said that it enables non-technical people to 'read and understand the underlying code'.
Ultimately though, I agree with your conclusion but for slightly different reasons. I use both RSpec and Cucumber daily; they are both awesome and provide a similar end result. I enjoy writing Cucumber features but in certain circumstances it doesn't scale for large/complex applications (in my experience). I think that this isn't a problem for many people though, unless they're doing something wrong 'under the hood'.
You may argue that that is silly, but there is some sort of reason.
(None for Sinatra, though. "He's so classy he deserves a web framework written after him.")
When I first started using Cucumber I thought it was cool that "anyone could read the test and understand it"
But that one advantage doesn't really matter that much when everybody in your project knows Ruby and doesn't need the natural language side of it.
Well, yeah, if people who aren't Ruby coders aren't reading your tests, Cucumber is overboard: the motivating use case for Cucumber is acceptance tests where the main part of the test (the part that isn't implemented in Ruby code) is owned by and validated by the customer/user, who presumably isn't generally, except by coincidence, a Ruby coder.
It might also be useful for integration tests that need to be validated by people familiar with the overall system design/architecture, but not necessarily with the language that any particular component is implemented in, for the same reason.
I use Cucumber on my own projects when it's just me, for precisely this reason.
I'd recommend the Cucumber Book if you want to read about how to use it well:
I've also blogged a lot about how to use Cucumber well here:
Those are different kinds of tests. Go write them in a unit or a spec or whatever you feel like. Code coverage - covering your ass during refactors - and integration - covering your ass before deployments - are just for you and can skip the regexp.
Gherkin is for collaborating with my customer and making them tell me what the fuck they actually want. 90% of the value happens before any of the steps are ever implemented. A few user interactions will be covered, likely not even all of the possible ones. Little step reuse as focus is on readability.
It's a really specific problem, and so far I haven't seen a superior option aside from Gherkin.
At my last company we had QA write all the cucumber tests and someone would hook in the new statements, which is the overhead you suggest. Now if you write 3 different gherkin statements that do the same thing, then that is not optimal.
Like many problems, I don't think the tool is at fault. It does what it claims. Provides human readable syntax for test cases and lets you hook that in however you want.
I think it's a valid point to think that extra layer is unnecessary, but I wouldn't go as far as discounting it and saying that no company has derived value from it.
Like all code, the messes usually stem from how the code is implemented, not the language itself.
It's a laudable goal, but the abstraction is so leaky in reality it just ends up creating more work for everyone.
Since capybara is so awesome, it gave a lot of people a nice impression of cucumber.
But unless you actually have a customer in the loop writing Cucumber tests, just use Capybara directly.
I just TDD'ly implemented a "user signs up for Direct Deposit" feature, and the acceptance tests look like this:
describe 'Direct Deposit' do
@workflow = DirectDepositWorkflow.new(create(:user))
describe 'User sets up direct deposit with correct info.', :js do
it 'user sees confirmation page - individual' do
class DirectDepositWorkflow < Struct.new(:user)
fill_in 'name', with: 'John'
page.has_css? '.success_alert', I18n.t(...)
Our POs also use git (non-technical != vcs illiterate) so we have a history of who wrote/modified what spec when.
Also, we have an additional simple layer of abstraction between feature steps and the actual tests. Most of our step definitions are three line methods that make calls to a ui driver that knows the details of our ui; or to a "given" driver that knows the details of our database/models. This ensures that there is only one place where anyone will implement, say, a user logging in.
Given(/^I am a logged in user$/) do
Given(/^I have signed in$/) do
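As an illustration of that driver layer, here is a self-contained sketch in which both phrasings delegate to one method. It makes several assumptions: `UiDriver#log_in`, `STEPS`, `run_step`, and the tiny `Given` defined below are hypothetical stand-ins so the example runs without Cucumber, and passing the driver as a block argument is a simplification of however the real suite shares it.

```ruby
# Hypothetical UI driver: the only place that knows how logging in works.
class UiDriver
  attr_reader :current_user, :login_calls

  def initialize
    @login_calls = 0
  end

  def log_in(name = 'default user')
    @login_calls += 1
    @current_user = name
  end
end

# Tiny stand-in for a step registry so the sketch runs on its own.
STEPS = {}

def Given(pattern, &block)
  STEPS[pattern] = block
end

# Find the first registered pattern matching the step text and run its body.
def run_step(text, driver)
  _, block = STEPS.find { |pattern, _| pattern.match?(text) }
  raise ArgumentError, "undefined step: #{text}" unless block

  block.call(driver)
end

# Two phrasings, both one-line delegations to the same driver method.
Given(/^I am a logged in user$/) { |ui| ui.log_in }
Given(/^I have signed in$/)      { |ui| ui.log_in }

ui = UiDriver.new
run_step('I am a logged in user', ui)
run_step('I have signed in', ui)
puts ui.login_calls
```

If the login flow changes, only `UiDriver#log_in` needs to change; neither the step definitions nor the feature files are touched.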
We generally only write golden path features. Edge cases are handled with lower level testing.
Our feature files stick to specifying business value rather than ui implementation. For example, we don't do this:
Given I visit the homepage
When I enter "children's bikes" in the search field
And I hit the submit button
Then I am on the results page
And I see a fieldset with a legend "Children's Bikes"
And I see a table containing "Huffy Sprite"
Given I search for a specific product category
Then I see the search term I used
And the results of the search
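One way to picture where the UI detail goes once the feature file stops mentioning it: a single business-level method that performs the clicks internally. This is only a sketch; `SearchPage`, its `CATALOG`, and the recorded `actions` are hypothetical stand-ins for a real browser driver.

```ruby
# Hypothetical page driver that records the low-level UI actions it performs.
class SearchPage
  CATALOG = { "children's bikes" => ['Huffy Sprite'] }.freeze

  attr_reader :actions, :heading, :results

  def initialize
    @actions = []
  end

  # One business-level operation; fields and buttons stay hidden in here.
  def search_for_category(term)
    @actions << 'visit the homepage'
    @actions << "enter #{term.inspect} in the search field"
    @actions << 'hit the submit button'
    @heading = term
    @results = CATALOG.fetch(term, [])
  end
end

page = SearchPage.new

# "Given I search for a specific product category"
page.search_for_category("children's bikes")

# "Then I see the search term I used"
puts page.heading

# "And the results of the search"
puts page.results.inspect
```

The fieldset, legend, and table from the longer version still get exercised, but a markup change now means editing one method instead of every feature that searches.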
There is additional infrastructure and PO training involved, but overall, I'd say cucumber improves clarity and communication on our team.
As someone who recently started using cucumber, this was definitely a frustration at first. But after a couple weeks of writing step definitions, and focusing on making them reusable (using some of the tips here: http://coryschires.com/ten-tips-for-writing-better-cucumber-... ), I ended up with a pretty good bank of general steps that I could cobble together into new features without much modification. The result was something much more readable and reusable than if I wasn't using Gherkin on top of the step definitions.
I now tend towards minimal step re-use, lots of one or two-line steps calling into nice clean plain ruby classes which actually do the work of my steps.
I've blogged a lot about how to use cucumber well here:
We use cucumber extensively, our cucumber suite is not really that large, and with each new feature it becomes harder to maintain. There has to be some middle ground out there.
It's no more burden to write step definitions than it is to write RSpec directly. I use SOLID OOP to write my tests, most of the logic lives in regular old methods, and my steps look like:
Then 'the current subscription payment date should be tomorrow' do
verify_next_payment_date(Date.today + 1)
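For readers wondering what sits behind a step like that, here is one possible shape for the "regular old method". The thread does not show the real helper, so `Subscription`, `BillingChecks`, and `StepContext` are assumptions made up for this sketch; only the name `verify_next_payment_date` comes from the step above.

```ruby
require 'date'

# Hypothetical domain object; the real app's model is not shown in the thread.
Subscription = Struct.new(:next_payment_date)

# Plain module holding the verification logic that the steps delegate to.
module BillingChecks
  def current_subscription
    @current_subscription || raise('no current subscription set')
  end

  # Returns true on a match and raises with a readable message otherwise.
  def verify_next_payment_date(expected)
    actual = current_subscription.next_payment_date
    return true if actual == expected

    raise "expected next payment on #{expected}, got #{actual}"
  end
end

# Stand-in for the object a step body would run against.
class StepContext
  include BillingChecks

  def initialize(subscription)
    @current_subscription = subscription
  end
end

tomorrow = Date.today + 1
context = StepContext.new(Subscription.new(tomorrow))

# Body of: Then 'the current subscription payment date should be tomorrow'
puts context.verify_next_payment_date(tomorrow)
```

Since the logic lives in an ordinary module, it can be unit tested and reused from RSpec just as easily as from a step definition.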
Well even for a developer of 15 years even the most obtuse English can be more easily absorbed than the cleanest code.
The speed of parsing and understanding code is directly related to the amount of time spent writing it and how recently it was read.
English can be read and understood at the same speed every time.
There is also a non-linear relationship between your feature file size and the underlying code they represent, especially if they represent integration testing. A single feature could test code from 10 different files.
For a single developer in a codebase they know and understand well and have recently worked on, Cucumber can act as friction on development.
For a single developer who hasn't worked on a codebase before, or who is returning to a codebase after some time, Cuke steps will definitely improve their ability to understand what the code is doing and, by providing a regression test suite, improve the quality and probably the speed of their output.
Code is a language of instruction, not communication. Documentation can help, but it is easily forgotten and fails at communicating integration.
In my experience the ability to just plug and play steps and quickly add additional coverage for edge cases is a huge bonus for test automation.
But again you have to actually engineer things and write reusable steps and intelligent enough test frameworks that you don't need to often dig through the test code.
I think this issue is somewhat similar to how, when writing unit tests, you'll want to build up a library of reusable custom asserts targeting your domain so you can quickly exercise your SUT and validate things. It's definitely an upfront cost to build things up like that, but in the long run it pays off very nicely when you avoid taking on too much test debt.
So tests here are just cucumber or rspec tests that verify the underlying frameworks that cucumber drives. So for example if you have partial object equivalency methods that verify that only specified fields on two objects match you would write tests to verify they are correctly matching or failing to match two objects as expected before you use that in your cucumber tests.
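A small sketch of that kind of partial-equivalency helper, together with the checks you would run against the helper itself before trusting it inside steps. The names `mismatched_fields`, `fields_match?`, and the `User` struct are hypothetical, chosen only for this example.

```ruby
# Hypothetical helper: compare two objects on the named fields only.
# Returns the fields that differ, so a failure can say what went wrong.
def mismatched_fields(expected, actual, fields)
  fields.reject { |field| expected.public_send(field) == actual.public_send(field) }
end

def fields_match?(expected, actual, fields)
  mismatched_fields(expected, actual, fields).empty?
end

User = Struct.new(:name, :email, :updated_at)

a = User.new('Ethel', 'ethel@example.com', 1)
b = User.new('Ethel', 'ethel@example.com', 2)

# Tests for the test helper itself, run before any step relies on it.
raise 'should match on name and email' unless fields_match?(a, b, %i[name email])
raise 'should not match on updated_at' if fields_match?(a, b, %i[name updated_at])
unless mismatched_fields(a, b, %i[name updated_at]) == [:updated_at]
  raise 'should report the differing field'
end

puts 'helper checks passed'
```

Checking both the matching and the non-matching case matters here: a helper that always returns true would make every scenario built on it pass silently.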
The second claim against cucumber / gherkin is not its fault, it's the toolset's fault. Using grep to find your step definition is not very effective.
If you use a tool like Rubymine, you can jump directly to the applicable step definition through one key combination.
In practice, you'll find grep is in the toolkit of far more Ruby programmers than the IDE Rubymine. And with good reason!
The real problem is step definitions with regex in them. You don't need to do this. Steak was written to avoid this problem: https://github.com/cavalle/steak
But the underlying issue described in this post is still valid - Cucumber is an unnecessary layer because it adds no value that you don't get with Capybara and your favourite testing framework.
The fact of the matter is grep is a poor tool to try to find your step definitions. There are tools out there that are much better for such things: vim with http://www.vim.org/scripts/script.php?script_id=2973, or an IDE, or <your favorite tool>.
You could make the claim that any abstraction is unnecessary if you see no value in it. It doesn't mean others will not find value in it.
Personally, I find it quite easy to understand and convey to non-devs the stories told through gherkin. That is part of the value prop for me.
If no one is reading them but the devs, I'd rather use something other than gherkin.
There is something deep and surprising about this fact.
The whole 'customer writes features' thing never worked for me either, but just having it as automated specifications for my projects makes it very valuable for me.
I no longer see the added value of Cucumber when you work in a project that has no non-technical team members writing tests.
I can read it to them, or we could collaborate on a user story that doesn't get directly mapped to tests, but then we have room for the interpretation/translation process to create the wrong assumptions in the test code.
There are plenty of tests I'd never show the customer because they involve implementation details, and those aren't going to be written in Gherkin because I don't hate myself enough to juggle that many regexps.
Of course I got into the same spot that almost everyone gets into doing that: too many "and then I click..." scenarios. Great coverage, but then the requirements change substantially and the test base is just a massive snarl of interdependent assumptions. At one point I had effectively written, in cucumber, a natural language interface to my app.
But then at some point I needed to move and I needed to move fast and I just let the whole infrastructure rot. Now I'm doing almost everything on the client side anyway, so it's all a little moot.
All of that being said, I'm still a Cucumber fan, but I approach it differently. Here's what I think:
You should have cukes for the 5-10 things that if they don't work make your product completely pointless.
On Gmail, there would be cukes like:
When I receive the email from Fred about dinner
And I open it
Then I should see the message
When I reply
Then Fred should get my reply
When /^I reply$/ do
fill_in "message", :with => "I'll be there!"
* Use cucumber as a way to force you to keep track of the handful of absolutely critical integrated experiences
* Don't use fancy grammars... Think of steps as human-readable function names.
* Don't do anything fancy in the steps. They should just be a list of actions that users would have to do.
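Taking "steps as human-readable function names" literally, the Gmail scenario above could be sketched in plain Ruby like this. `FakeMail` and `InboxScenario` are hypothetical and exist only to make the example runnable; a real suite would drive the actual app instead of an in-memory stand-in.

```ruby
# Hypothetical in-memory mail system standing in for the real app.
class FakeMail
  attr_reader :inbox, :sent

  def initialize
    @inbox = []
    @sent = []
  end
end

# Each step is a plainly named method holding a short list of user actions.
class InboxScenario
  attr_reader :visible_message

  def initialize(mail)
    @mail = mail
  end

  def i_receive_the_email_from_fred_about_dinner
    @mail.inbox << { from: 'fred', subject: 'dinner', body: 'Dinner at 7?' }
  end

  def i_open_it
    @open = @mail.inbox.last
    @visible_message = @open[:body]
  end

  def i_reply
    @mail.sent << { to: @open[:from], body: "I'll be there!" }
  end
end

mail = FakeMail.new
scenario = InboxScenario.new(mail)
scenario.i_receive_the_email_from_fred_about_dinner
scenario.i_open_it
scenario.i_reply
puts mail.sent.last[:to]
```

Nothing in the steps branches or parses arguments, which is what keeps a handful of critical scenarios cheap to maintain.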
And yeah, there's no reason you couldn't do all of this in Steak. I think Cucumber is just nice in that it encourages you to write it out in human terms.
What are the costs of having to understand and maintain code written using a separate little technology stack (cucumber) for just 5 or 10 scenarios?
Okay, you sort of answer that: " I think Cucumber is just nice in that it encourages you to write it out in human terms."
Okay. If it's worth the cost to you to do this. You could also, of course, just write things out in human terms in comments above the 5 or 10 tests written in rspec or whatever.
It seems like it always times out waiting for something on the page to load.
It'd be awesome if people started tagging these as [technical] or [computers] or [vegetable-aficionados].
That is also where I draw the line. The whole argument that other non-technical folks can read Gherkin may be true, but I can't imagine them enjoying it.
It's like reading computer-speak. Why can't features just be described in English...?
But, yes, just yes.
(I'd go even further and say "why rspec when you can test::unit/minitest? especially for Rails since the minitest path is officially supported by rails." Again, the extra layer of stuff is just more stuff to understand and troubleshoot when your tests aren't working as you'd like -- for unclear benefit.)
No one forces people to use them, yet they seem to have a great time crying about it without trying to defend their position.
Cucumber and RSpec are among a set of testing tools which have really changed the testing landscape. It would be really wonderful if people spent half this time and energy on trying to create a better tool or joining the open source groups of Cucumber and RSpec and trying to make them better.
Well, other than, in many cases, employers.