
The Tragedy of Given-When-Then - wheresvic1
https://theitriskmanager.com/2019/04/06/the-tragedy-of-given-when-then/
======
pjc50
Whereas:

> The reduction of the tester to an expert translator/typist is a tragedy

> The Given-When-Then detracts from understanding and readability but provides
> the much prized automation through tools like Cucumber

> Instead of abstract domain models in Rational Rose

Now I see how we got into this situation. All this misery comes from the
people who understand the code and the people who understand the business
being different, non-intersecting groups of people.

Since the BAs can't read the code, there is a desire to construct a technical
representation that they _can_ read, representing some kind of formal
specification, and then derive as much as possible the program from that. This
was the promise of Rational Rose and the whole UML project. Had it been
successful, it would have reduced programmers to stenographers, "coders" in a
very limited sense. As it is, in a "Rational" system of this kind, coders end
up producing increasingly elaborate polyfills between the machine-generated
parts of the system and the rest.

The target of this is always the "5GL" dream:

    
    
         - specify program in English
         - ???
         - automated tools transform to code
    

.. without programmers involved. Unfortunately we've not managed to reduce the
irreducible complexity of the step in the middle.

GWT does this de-skilling to _testers_ instead; it's effectively TDD with
tests in this quasi-human-readable format that gets unimaginatively translated
into programs by the testers.

The proposed solution is to do more communication through spreadsheet-
prototypes. Since these are effectively programs in Excel, with all the real
programming capabilities thereof, you get a real working model of the system
that actually behaves like a program.

Many businesses simplify this process by then just deploying the spreadsheet
to production.

(I'm not familiar with Cucumber, but it looks like a thing for constructing
toy "human readable" DSLs for tests? I wonder if we could get BAs to learn
INFORM?)

~~~
lugg
I'm surprised you missed it.

Businesses solve this problem by hiring and training programmers in their
business model and domain.

Coders are worthless. Coders with business acumen and domain knowledge are
invaluable.

This is also why I will never work with anyone with a job title of BA. They
are trying to do my job but can only do one half of it. They're a pointless
bottleneck in the system.

And no, there is no place for BAs who can code, only those who do. They're
called software developers.

At this point I don't know if software engineer, developer, programmer or
coder is really the right term any more.

I'm more of a systems analyst. The amount of time I spend coding is minuscule
and I don't think I'm alone in this.

~~~
borland
That's a bit inflammatory... My experience with BA's is that they're like
anyone else on a team. They are more skilled in certain areas, and they can
provide a lot of value if their skills are used appropriately.

I've found BA's are very useful in the following areas specifically:

- Communication with the market (either customers or in-market staff such as
sales/support) to gather feedback about what they'd want

- Orchestrating decisions (asking all the dozen stakeholders what they want
to do and arriving at a sensible conclusion)

- Arbitrating feature/UI decisions (people on the team aren't agreeing on
whether we should do something or not)

- Documenting requirements (by this I mean the desired functionality that we
believe customers need/want)

- Iterating and getting feedback on requirements

- There's more, but off the top of my head that's all I can think of

Sure, I as a lead developer have business knowledge and can also do the job of
business analysis, but a dedicated BA will be able to spend more time doing
that - increasing their skills, and freeing up me to do other things. If I
take on all of the above tasks in a decent-sized project, it's going to suck
up a huge amount of my time - in many cases almost all of it - so then
basically I'm just a BA with a different job title. BA's are great if you
learn to understand the role and how it can help a team

~~~
lugg
All of those things are what a developer worth anything to a company should be
doing.

Similarly, BAs without business domain knowledge are even more worthless
because of this.

I.e. consultant BAs are, as most of us know, a waste of money.

BAs are struggling to find a language to describe their business requirements
in a way the programmer or computer can understand.

It's code. It's what programmers have been writing all along. That is the god
damn language they need.

Given when then? If then else.

Surprise!

~~~
kthejoker2
Ricardo strikes again. Relative comparative advantage says even if a developer
is a better coder and BA, they're better off spending time coding and letting
non coders be BAs to maximize overall throughput.

~~~
lugg
This assumes a BA is cheaper and more available than a programmer.

If they are cheaper, I guarantee you're going to have problems.

You're literally installing a weak link in the most crucial part of the chain.

~~~
kthejoker2
BAs are by definition more available than developers, as all developers are
BAs but not vice versa.

And given basic supply and demand they are cheaper.

Anyway read your Ricardo, you're right but wrong about the implications. Your
development is an even weaker link in the chain because good developers are
even rarer than good BAs.

~~~
dragonwriter
> BAs are by definition more available than developers, as all developers are
> BAs but not vice versa.

Plenty of developers are _not_ BAs. BAs are systems analysts with expertise in
requirements elicitation (basically, goal-directed interviewing) and
technical writing. Plenty of developers are neither systems analysts _nor_
skilled at interviewing or technical writing.

------
joshwa
The article seems to assume that it is, in fact, possible to completely and
correctly specify a system, and not only that, do so ahead of development.

This is a pipe dream in all but a minuscule slice of software projects.

GWT/cucumber is as decent a tool as any for creating automated tests of system
actions (and especially interactions) in a mostly-human-readable format that
is likely to be understood by BAs, testers, and devs, even if not all of them
are expected to be able to write them.

I've used them across many projects with great success, with the understanding
that it's not intended to replace unit tests for calculations, nor stories for
initial specification.

As endian says downthread[0]: "G/W/T isn't the solution to domain
understanding, conversations are. You should be continually communicating
between all stakeholders to maintain a current domain understanding."

[0]
[https://news.ycombinator.com/item?id=19674013](https://news.ycombinator.com/item?id=19674013)

~~~
p1necone
It seems pretty obvious to me that any way to "completely and correctly
specify a system" is going to look a hell of a lot like a programming
language, and be no less complex or hard to understand than one... it's almost
a tautology.

------
pytester
>Although Given-When-Then is a fantastic way to describe interactions, state
and behaviour, it is a lousy way to describe data and calculations.

I follow a slightly different technique that I think makes given/when/then
also a useful tool for describing data and calculations.

1) Write the test as follows, using _deliberately simplified but still
realistic data_ (this is crucial):

>Given <a set of market data and trades in a table>

>When <arbitrary event such as calculation is performed>

>Then <leave blank>

2) Write the code that outputs the data/calculation.

3) Run the test in a "rewrite" mode that fills in the results of Then based
upon actual output (this process is somewhat similar to golden master).

4) You now have a passing test and generated data which you can eyeball to see
if it is in line with what you would expect. This test with recorded output
can then be shown to the PO (or whomever) and committed to source control and
used for regression testing.

This obviously isn't possible with cucumber or other gherkiny tools and it
does require your processes to be fully deterministic (a laudable goal
anyway), but it works pretty well IMHO.
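
A minimal sketch of the "rewrite" mode in steps 1 to 4, assuming the scenario lives in a JSON file with `given`, `when` and `then` keys. The file layout, `portfolio_value` and `check_scenario` are all hypothetical names made up for illustration, not part of any existing tool:

```python
import json
import tempfile
from pathlib import Path


def portfolio_value(trades, prices):
    """Hypothetical calculation under test: sum of quantity * price."""
    return sum(t["qty"] * prices[t["symbol"]] for t in trades)


def check_scenario(path, rewrite=False):
    """Run one scenario file.

    In rewrite mode the actual output is recorded as the 'then' value;
    otherwise the recorded 'then' is compared with the actual output.
    """
    path = Path(path)
    scenario = json.loads(path.read_text())
    given = scenario["given"]
    actual = portfolio_value(given["trades"], given["prices"])
    if rewrite:
        scenario["then"] = actual
        path.write_text(json.dumps(scenario, indent=2))
        return True
    return scenario["then"] == actual


# Step 1: deliberately simplified but realistic data, with Then left blank.
scenario_file = Path(tempfile.mkdtemp()) / "valuation.json"
scenario_file.write_text(json.dumps({
    "given": {
        "trades": [{"symbol": "AAA", "qty": 3}, {"symbol": "BBB", "qty": 2}],
        "prices": {"AAA": 10, "BBB": 20},
    },
    "when": "the portfolio is valued",
    "then": None,
}))

# Step 3: fill in Then from the actual output, then eyeball the file.
first_run = check_scenario(scenario_file, rewrite=True)
# Step 4: later runs compare against the recorded value (regression test).
second_run = check_scenario(scenario_file)
recorded = json.loads(scenario_file.read_text())["then"]
```

The recorded value is only as trustworthy as the review in step 4: nothing in the code checks that the first output was right, only that later outputs match it.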

~~~
delusional
That's a pretty common technique. The problem I've found is that it risks
incorrect results being assumed correct. If I refactor your algorithm, and fix a
bug in the process, I'll be very confused if your tests start failing.

The benefit of calculating it yourself first is obvious, you get to cross
check your result. That tends to highlight minor errors such as -/+ errors.

This is not a problem if stability is more important than correctness of
course, but that's not where I am.

------
endiangroup
Act 1 - G/W/T can be used to express iteratively aspects of an algorithm such
that you can derive it without knowing or fully understanding it (algorithm
triangulation: think GPS where each satellite is a constraint and you derive
through iteration of each passing scenario the general algorithm).

Act 2 - is really about process rather than G/W/T (which is really just AAA,
arrange, act & assert).

Act 3 - again process, G/W/T isn't the solution to domain understanding,
conversations are. You should be continually communicating between all
stakeholders to maintain a current domain understanding.

We wrote an article recently on the limits of BDD [1], G/W/T didn't really
come up, there are other more glaring issues with BDD when it comes to systems
that intersect mismatched understandings of the real world between experts and
users. Unrealistic wants and goals are killer. Additionally we started writing
a tool to attach metadata to scenarios (G/W/T) so you can capture technical
details about things called SpecStack [2]

[1] [https://endian.io/articles/limits-of-
bdd/](https://endian.io/articles/limits-of-bdd/) [2]
[https://github.com/endiangroup/specstack](https://github.com/endiangroup/specstack)

~~~
pcm191
Hi Endian

Interesting article. I would be interested to see if feature injection
("Working Backwards") would help you. I do not think the first of the problems
is a BDD problem, it's more a "reality" issue. The only way to solve it is to
use financial derivatives and I suspect the lack of liquidity in the crypto
would make that prohibitively expensive.

Judging from your web-site, you are only 5 mins up the road from me. Would you
be interested in a lunchtime session to see if FI would help?

Chris

------
ryanmarsh
G/W/T is great at capturing the context/action/outcome of a test scenario in
English. It isn’t great for all cases though. Furthermore Gherkin could be
updated to allow more flexibility.

Having a structured format for tests/requirements, in English (et al.), can be
_incredibly_ helpful. I would love to see some innovation around helping
programmers and non-programmers reach a shared understanding of what the
system should do and the cases we will use to verify it.

I don’t think unstructured conversation, unstructured English, or Excel tables
are the solution. This is still an unsolved problem in our industry.

~~~
pcm191
Hi Ryan

How about structured Excel sheets? Excel sheets with guide rails for business
people to structure the expression of their thoughts?

That's what I'm getting at.

Chris

------
60sec
The main problem with cucumber / GWT is that in most implementations it serves
as an opaque abstraction layer which is an incomplete/incorrect model
abstraction of the system itself.

Been doing a lot of API testing recently with karate dsl and writing cucumber
tests that include json expressions with some syntactical sugar for
validation. The tests serve as a specification for the system which is
actually quite a bit more precise than even swagger since you can even go back
in time and compare the deltas on request/response between test executions to
troubleshoot regressions.

Agree that GWT can't help business understand the inherent complexity of a
state machine, but individual tests can be used effectively to model state
transitions, especially at the api level.

------
Rooster61
I think a lot of the issue is the misconception that G/W/T feature files can
essentially replace specifications/requirements. They can and absolutely
should REFLECT the requirements, but they are ill suited to act as the actual
specifications themselves.

Feature files to me are most effective when they act as a roadmap of the steps
one needs to take to effectively test a given set of use cases, NOT a 1-1
carbon copy of the requirements. It should tell a non-programmer what the test
is doing without having to dive into the code, while being a scaffold to which
a programmer can build their test logic into. If one needs to look at the
requirements, one should do just that, read the requirements document, or in
the developer's case, read the requirements set forth in the user story.
Scenarios are guides to how to navigate what the test is doing, not what the
application itself should be doing.

Also, I often see programmers attempt to write a BDD test and run into a case
that doesn't quite fit flush into G/W/T, then ask the community how they
might go about writing that test. Instead of understanding and flexibility, they
are met with an abrupt "that's not BDD, you are doing it wrong, if you did it
the BDD way everything would work out". That's discouraging, frustrating, and
destructive. G/W/T is not gospel, and it doesn't fit all test cases. I see
nothing wrong with fudging some tests to not follow Gherkin to-the-letter if
it better facilitates a test while still remaining clear what the test is
doing in plain English within the feature file's scenario.

------
neves
I can't agree more. I'm still to see a testing tool that can be used for
specification and used by end users. Today everything is developer centric.
Maybe there's no escape to this. The solution is really to make your
developers understand the business.

~~~
endiangroup
That's where things like Domain Driven Design are going, and really it makes a
lot of sense. Why express and model your domain in terms outside of it? It's
like forever living through a translator instead of just learning the language
in the first instance, you add more points of potential failure.

~~~
BoiledCabbage
Domain Driven Design is one of those things that seems so easy to write off as
"just another process" but looking closer it seems like it really is a solid
method for resolving issues of modeling, requirements, implementation and
communication on a project.

I expect to see it continue to organically grow in popularity as people try it
and adopt it. It's both obvious and insightful at the same time.

------
RHSeeger
> We will realise that describing data and calculations using the Given-When-
> Then format leads to tragedy, and will create and popularise tools and
> approaches using Excel to document examples.

Given all the examples given, it sounds more like Cucumber is the problem.
Having specifications/tests written in the form of given-when-then isn't shown
to have any issues. Rather, taking those specs/requirements and disconnecting
them from the people who need them is the issue.

~~~
thom
Is the complaint literally just about aligning the text in tables of a
Cucumber file because business analysts are more comfortable in Excel? Or just
managing the text of your test suite across a whole project? I am struggling
to parse out the underlying point of the article.

~~~
anentropic
What I understood is:

You are supposed to capture requirements in Given/When/Then form _before_
coding, and they should be written in a way that is independent of the
implementation

The complaint seems to be that what happens in practice is "the devs have
taken over" the system, so Cucumber just becomes a test automation software

This has led to writing tests in Given/When/Then format which test overly
specific elements of the implementation, and are often written _after the
fact_ , instead of as a way to capture requirements.

And some other stuff, but that's primarily what I got out of it.

~~~
tannhaeuser
> _the devs have taken over [authoring tests]_

That's what happened in the projects where I used fitnesse, jbehave, cucumber,
etc. The users couldn't understand the artifacts and what they're supposed to
do, and the tools were highly idiosyncratic. It ended in becoming just yet
another hassle for developers.

~~~
pcm191
Exactly.

Whereas using Excel for specification of calculations (by Business Users or
Business Analysts) actually brings the business and developers closer
together, removing the need for intermediaries. Business Users are very happy
using Excel and checking calculations etc. as it tends to be their home
environment.

------
rgoulter
_A common anti-pattern with Given-When-Then has the Three Amigoes
collaborating on scenarios that are stored as acceptance criteria in user
stories._

I agree with this. Writing Cucumber/Gherkin scenarios is extra effort if the
Cucumber files themselves aren't used/read elsewhere. -- It'd be simpler to
embed "Given/When/Then" statements within test code (like RSpec).

I'd emphasise that Gojko Adzic's "Specification By Example" suggests
discussing examples before refining to a specification; that may get around
the author's complaint that non-table formats don't allow for important cases.

That said, "Given/When/Then" is hardly magical, so doesn't deserve much
praise/criticism itself. Any test involves "do an action, check the result"
(with "setup the system" and "cleanup" being implied). Sometimes called
"Assemble, Act, Assert". "G/W/T" is just a neat, consistent format for
describing behaviour in English. A table of values specifies some computation;
the column titles help to describe that behaviour.
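
As a rough illustration of a table of values specifying a computation, here is a sketch where the column titles name the inputs and the expected result and each row is one example. The discount rule, `COLUMNS`, `EXAMPLES` and `run_examples` are invented for this example only:

```python
# Column titles describe the behaviour; each row is one concrete example.
COLUMNS = ("order_total", "is_member", "expected_discount")
EXAMPLES = [
    (50, False, 0),
    (50, True, 5),
    (200, False, 10),
    (200, True, 30),
]


def discount(order_total, is_member):
    """Hypothetical rule: 5% off orders of 100 or more, plus 10% for members."""
    rate = 0
    if order_total >= 100:
        rate += 5
    if is_member:
        rate += 10
    return order_total * rate // 100


def run_examples():
    """Check every row of the table; return the rows that do not match."""
    failures = []
    for row in EXAMPLES:
        case = dict(zip(COLUMNS, row))
        actual = discount(case["order_total"], case["is_member"])
        if actual != case["expected_discount"]:
            failures.append((case, actual))
    return failures


failures = run_examples()
```

The table could just as well be loaded from a CSV file exported from a spreadsheet; the check itself would not change.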

~~~
Dangeranger
You can embed G/W/T into RSpec. There's a library called Turnip[0], a play on
Cucumber, that extends the syntax to Feature tests in RSpec.

- [0]
[https://github.com/jnicklas/turnip](https://github.com/jnicklas/turnip)

The reason RSpec didn't include this originally probably stems from the
library being designed for testing at the class or module level, rather than
an entire application.

------
dmitryminkovsky
This is why I like Spock [0]. You can go G/W/T, W/T, T, or Expect [1]. It's
billed as "multi-paradigm" which really just means you can do whatever feels
right for a given case. Also its data table feature is wonderful [2][3].

[0]: [http://spockframework.org/](http://spockframework.org/)

[1]:
[http://spockframework.org/spock/docs/1.3/spock_primer.html#_...](http://spockframework.org/spock/docs/1.3/spock_primer.html#_blocks)

[2]:
[http://spockframework.org/spock/docs/1.3/data_driven_testing...](http://spockframework.org/spock/docs/1.3/data_driven_testing.html)

[3]:
[https://twitter.com/dminkovsky/status/1116727735399976966](https://twitter.com/dminkovsky/status/1116727735399976966)

~~~
vorg
> you can do whatever feels right for a given case

Too bad you can't use whatever specification language feels right. Spock only
provides Apache Groovy, with its tacky syntax hacks like long strings for
function names, block labels having special meanings based on their name, or
the OR and LOR operators being used for drawing tables in the code. In the
past when software has provided Groovy for writing specs, they eventually
provide an alternative when Groovy's shortfalls become obvious, e.g. Kotlin
for Gradle [1], or the Declarative Pipeline Syntax [2] for Jenkins.

[1]:
[https://docs.gradle.org/5.0/userguide/kotlin_dsl.html](https://docs.gradle.org/5.0/userguide/kotlin_dsl.html)

[2]: [https://jenkins.io/blog/2016/12/19/declarative-pipeline-
beta](https://jenkins.io/blog/2016/12/19/declarative-pipeline-beta)

~~~
zmmmmm
> Groovy, with its tacky syntax hacks like long strings for function names

Actually, Kotlin does that too. I think it is even encouraged for writing
tests.

------
raldi
This article would have been better if at some point it explained what Given-
When-Then is.

~~~
Chlorus
That would have gotten in the way of making wild generalizations about what
Developers & BAs do.

------
verisimilitudes
>Before the internet, user experience was considered of little value because
most users of systems were internal employees of companies.

I'm skeptical of this claim. It would help if there were a year, considering
it's not clear if the author means the modern Internet, the ARPANET, or when
the modern Internet became widely available to a larger group of people. Based
on the mentions of Excel, I'm inclined to believe it's the last option I
listed.

The MIT AI lab and other research areas come to mind as places that cared
about how the programs were operated and whatnot and these weren't exclusively
used by employees. SHRDLU comes to mind. While I'm thinking about it, the
Apple Macintosh also does.

I don't believe I was familiar with this Given-When-Then model beforehand, but
I also think that's because it's somewhat natural, or at least seems natural.
The author has failed to convince me why this is a bad thing. I suppose I can
see why these larger business practices are poor, but that has me failing to
see why this particular practice is singled out.

------
exelius
Yeah; I suspect the core of the problem is that too many software developers
fancy themselves business analysts while too many business analysts start to
cower in fear whenever you suggest they check something in to GitHub.

~~~
ryanmarsh
When I teach BDD classes I sometimes joke that we might not need acceptance
tests written in G/W/T if the biz people could read programmer tests.

------
projektfu
You might want to check out Fit as a testing framework that may be more
appropriate for your use case than Cucumber. Not sure if it's still well
maintained, but it could be brought up to speed without much effort.

------
noveltyaccount
TL;DR: given-when-then can sometimes obscure requirements rather than
illuminate them. Use Excel or other tools to document such scenarios, as
everyone in the software design process can understand Excel formulas.

~~~
thom
Why would moving a table from The Place All The Other Requirements Are Kept to
an Excel file stored elsewhere do anything other than obscure requirements?

------
jrochkind1
I think there's a lot of people down on cucumber after experience with it, in
several different contexts... what do you think, what has been your
experience?

~~~
Macha
The dream of having the requirements be the test is not realistic:

* Good luck telling your PM their English has a syntax error

* Once that's out the window and the cucumber files are just maintained by devs, is the regex translation layer worth it?

* A lot of implicit state also gets wrapped up in the test in making these phrases at least semi-readable, which makes for a pain debugging failures.
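
To make the "regex translation layer" and the implicit state concrete, here is a toy sketch of how such a layer works in general. It is not Cucumber's actual API; `step`, `STEPS` and `run_scenario` are hypothetical names, and the `ctx` dictionary stands in for the state shared silently between steps:

```python
import re

# Registry of (compiled pattern, handler) pairs: the translation layer.
STEPS = []


def step(pattern):
    """Register a handler for every scenario line matching the pattern."""
    def register(func):
        STEPS.append((re.compile(pattern), func))
        return func
    return register


@step(r"^Given an account with balance (\d+)$")
def given_balance(ctx, amount):
    ctx["balance"] = int(amount)


@step(r"^When I withdraw (\d+)$")
def when_withdraw(ctx, amount):
    ctx["balance"] -= int(amount)


@step(r"^Then the balance is (\d+)$")
def then_balance(ctx, amount):
    assert ctx["balance"] == int(amount), ctx


def run_scenario(text):
    """Run each line through the first matching step; ctx is the implicit state."""
    ctx = {}
    for line in text.strip().splitlines():
        line = line.strip()
        for pattern, func in STEPS:
            match = pattern.match(line)
            if match:
                func(ctx, *match.groups())
                break
        else:
            # The "your English has a syntax error" case.
            raise SyntaxError(f"no step matches: {line!r}")
    return ctx


final_state = run_scenario("""
    Given an account with balance 100
    When I withdraw 30
    Then the balance is 70
""")
```

A line phrased slightly differently, such as "When I take out 30", matches no pattern and fails before any test logic runs, which is the first bullet above in miniature.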

~~~
rhinoceraptor
You can also enter Cucumber-developer double jeopardy where you need to meet
with the PM for an hour to write Cucumber specs, and then meet with the QA for
an hour to implement them.

When you could have just written a simple integration test in 15 minutes, but
the Scrum master thought Cucumber would be a good idea.

------
philipodonnell
> The discussion helped me realise that Given-When-Then is as much of a
> hindrance in some contexts as it is a help in other contexts.

I like when authors express a strong viewpoint but then also include
descriptions of circumstances where their viewpoint may not be applicable.
This seems to be alluded in the above quote, but are there specific contexts
where Given-When-Then _is_ helpful and the appropriate mechanism to document
requirements?

~~~
rgoulter
I think rspec's Cucumber docs are fine. [https://relishapp.com/rspec/rspec-
expectations/v/3-8/docs/bu...](https://relishapp.com/rspec/rspec-
expectations/v/3-8/docs/built-in-matchers/equality-matchers)

It's an executable specification, but it's also in a format which is readable.
If not "documentation", then it's at least "small, verified examples".

I'd find a table of values less readable. I don't mind if a document like this
is the output of running the tests themselves, though. (Obviously, the
important details are "input program, expected output", which might not lead
you to use Cucumber, but I think it's a fine use of Cucumber).

------
xchip
That is pretty much the core of software engineering, I'd call it Data-If-Then
and I fail to see why this is a tragedy.

~~~
Sahhaese
I think that is the tragedy that the article covers. It is described as a
scenario where the engineering has ended up squeezed into the business
analysis, and there isn't good communication between there and the developers,
who are seen as implementers.

This doesn't feel familiar to me but I've tended to always work in legacy
systems where the opposite problem (no BA at all) is a more familiar problem.

------
barbecue_sauce
What ever happened to Systems Analysts?

------
jodrellblank
arcfide has claimed on several occasions that non-programmers take to APL
quite easily, and can read and collaborate on it with a programmer, be talked
through it directly, in a way they can't/won't do for mainstream languages.

I find this such an unlikely sounding claim that I want to reject it without
consideration, or at least assume that he's only talking to a very restricted
subset of engineering non-programmers.

APL is decades old and very much a business language by origin in IBM System
360, there ought to be decades of people's experience with this on both sides
to back it up or refute it - programmer and non-programmer. Is there?

------
macca321
It's a fallacy that GWT has to be at the browser automation level. And it's a
lot easier to understand the point of a test if it's got GWT comments
interspersed with the code.
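
For instance, a plain unit test with GWT comments might look like the sketch below. `ShoppingCart` is a hypothetical class written only for this example, and no browser or Gherkin tooling is involved:

```python
import unittest


class ShoppingCart:
    """Hypothetical class under test."""

    def __init__(self):
        self.items = []

    def add(self, name, price, qty=1):
        self.items.append((name, price, qty))

    def total(self):
        return sum(price * qty for _, price, qty in self.items)


class ShoppingCartTest(unittest.TestCase):
    def test_total_sums_price_times_quantity(self):
        # Given a cart with two kinds of item
        cart = ShoppingCart()
        cart.add("pen", 2, qty=3)
        cart.add("notebook", 5)

        # When the total is calculated
        total = cart.total()

        # Then it is the sum of price times quantity
        self.assertEqual(total, 11)
```

The comments carry the same arrange/act/assert structure as a feature file, but they sit next to the code they describe.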

