Are natural language specifications useful? (alastairreid.github.io)
39 points by ingve on Aug 21, 2017 | 13 comments



I've never found natural language interfaces to be useful in programming.

The hard part of programming is specifying the problem in an unambiguous way.

If anything, natural languages can make this harder to accomplish.


“The act of describing a program in unambiguous detail and the act of programming are one and the same.” - Kevlin Henney

While I mostly agree with this quote, please note that it doesn't mean our formal programming languages can't be improved to feel more "natural" (actually, we've been doing this for decades).


BDD test definitions are the ones that really annoy me. As far as I can tell, they're written in "natural language" (i.e. a blessed set of specific parameterizable phrases) purely so that they can be written by PMs rather than devs. If that has ever actually happened in the entire history of anything, I haven't witnessed it.


I find utility in them being understood by PMs, even if not written.

Also, the overly-parameterized kinds of reusable statements like "Click on 'whatever' then fill in 'name' with 'stuff'" are an anti-pattern (they describe what, not why, and they fail to provide actual reuse like a page object would, so you get costly maintenance). They should read "When I log in" or "When I register as a user"; they should interact with high-level objects provided in your test suite; they should be purely whole-system-user-interface integration tests; and you should have very few of them, just enough to answer "At the human level, what does this thing do for users, and why?"

Any tool can be misapplied, poison is in the dosage, etc.
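A minimal sketch of the page-object idea mentioned above, in Python. Everything here is hypothetical: `FakeBrowser` stands in for a real browser driver, and the selectors, `LoginPage`, and `when_i_log_in` are invented names for illustration, not part of Cucumber or any other tool.

```python
class FakeBrowser:
    """Hypothetical stand-in for a real browser driver; it only records actions."""

    def __init__(self):
        self.actions = []

    def fill(self, selector, value):
        self.actions.append(("fill", selector, value))

    def click(self, selector):
        self.actions.append(("click", selector))


class LoginPage:
    """Page object: the only place that knows the login form's selectors."""

    USERNAME = "#username"  # assumed selectors, for illustration only
    PASSWORD = "#password"
    SUBMIT = "#log-in"

    def __init__(self, browser):
        self.browser = browser

    def log_in(self, username, password):
        self.browser.fill(self.USERNAME, username)
        self.browser.fill(self.PASSWORD, password)
        self.browser.click(self.SUBMIT)


def when_i_log_in(browser, username="alice", password="s3cret"):
    """High-level step: states the intent and delegates the how to the page object."""
    LoginPage(browser).log_in(username, password)


browser = FakeBrowser()
when_i_log_in(browser)
print(browser.actions)
```

If the login form changes, only `LoginPage` needs updating; every story that says "When I log in" stays as it is.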


>They should read "When I log in" or "When I register as a user"

I hate those kinds of stories. Half of the relevant scenario information is concealed in a mess of Turing-complete code. Which user logged in? How did you register as a user?

I've read tons of stories like this and they're often next to useless. You might as well just write the test in regular code and put a comment at the top of the file and get the PM to read that instead.

If it's something the user does, I don't want the executable spec to bury those details in Turing-complete code because writing it in the higher-level language is "hard". I want them surfaced in the readable scenario.

The fact that Cucumber makes it difficult to write readable, terse, deduplicated, parameterized user stories is, IMHO, a problem with Cucumber, not a problem with the people who write it.


If it matters what the user's username was or what button they clicked or what form they're on --- to the business --- then sure, describe those in natural language specs. If they only matter to the implementation, they should be in the implementation's integration spec suite (which, discussing technical detail, is not likely to be consumed or produced by business, and so has nothing to gain from being written in natural language; it can simply have natural-language comments, as you mention).

You shouldn't, e.g., try to reach any coverage level with only an NL story suite; that is a gross misapplication.

NL Story suite has a very simple utility: the customer can read them and say "yeah, this is an example of what should happen" or go "no, that's not quite what I meant" early on, before implementation. Being able to automate testing of that document as validated by the customer is just a bonus. 90% of the value happens before any code is written. If you are writing an executable natural language story and have no customers to read it, or if you are covering details they will glaze over while reading because they can't imagine or validate what is described, then you're using the wrong tool.

edit: Even a suite in code, for technical users, should strive to have more reuse than having each test describe exactly what actions are taken, since that's too hard to maintain. If your tests aren't DRY, you spend too much time updating the 50 places your login form selectors are reflected in the suite and the test suite acts as scar tissue that prevents changes instead of a protective but flexible skin and skeleton that helps the program adapt.

Over-using story tests and dealing with the maintenance burden / poor abstractions, overly-repetitive integration suites that make changes hard and inappropriate use of unit testing that makes refactoring difficult are all really common traps when people don't get the big picture of test automation, but they are entirely avoidable. A test suite is like any other piece of software and has to be designed (and factored) for the reality that its behavior will change.


>If it matters what the user's username was or what button they clicked or what form they're on --- to the business --- then sure, describe those in natural language specs

IME it's usually left up to the test programmer what to put in them, and they often put in vague stuff that looks exactly like what you just wrote. That becomes useless for both PMs and non-PMs reading the natural language suite, because it leaves out business-critical details.

It's also a hack often done in order to keep the story suite short because (as I mention below) there is no inheritance in Cucumber. There's a reason why the whole world has not yet flocked to BDD and I don't think it's an issue with BDD itself (I'm a big fan).

>NL Story suite has a very simple utility: the customer can read them and say "yeah, this is an example of what should happen" or go "no, that's not quite what I meant" early on

I think the idea that it has to be natural language so that a customer can read them is bullshit. This is the exact same mistake the creators of COBOL made, thinking that natural language naturally elucidates things. It doesn't. It parses very ambiguously. That's a feature if you're flirting with a girl perhaps, but a bug if you're trying to write a precise executable specification.

I feel very strongly that the story suite should be written in a language that is easy to parse, can handle parameterization and inheritance, but isn't Turing-complete. It ought to be readable for PMs and still maintainable as part of an integration test suite.
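One way to picture such a story language is as plain data with a parent link. The sketch below is only an illustration under assumed names: the `STORIES` layout, the `based_on` key, and `resolve` are all hypothetical, and the stories themselves are invented. Parameters use `$name` placeholders from Python's standard `string.Template`.

```python
from string import Template

# Hypothetical story data: plain dictionaries with no loops or conditionals,
# so the story format itself is not Turing-complete.
STORIES = {
    "submit order": {
        "based_on": None,
        "params": {"user": "alice", "item": "book", "total": "10.00"},
        "steps": [
            "Given $user is logged in",
            "When $user orders a $item",
            "Then the total is $total",
        ],
    },
    "submit order with coupon": {
        "based_on": "submit order",  # inherits the parent's steps and params
        "params": {"coupon": "HALF", "discounted": "5.00"},
        "steps": [
            "When $user applies coupon $coupon",
            "Then the total is $discounted",
        ],
    },
}


def resolve(name, stories=STORIES):
    """Return the fully expanded step list for a story, following based_on."""
    chain, seen = [], set()
    while name is not None:
        if name in seen:
            raise ValueError(f"inheritance cycle at {name!r}")
        seen.add(name)
        chain.append(stories[name])
        name = stories[name]["based_on"]
    chain.reverse()  # oldest ancestor first

    params, steps = {}, []
    for story in chain:
        params.update(story["params"])  # child parameters override the parent's
        steps.extend(story["steps"])    # child steps continue the parent's
    return [Template(step).substitute(params) for step in steps]


for line in resolve("submit order with coupon"):
    print(line)
```

The forked story states only what differs from its parent, and the readable, fully expanded scenario is generated rather than copied by hand.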

>If you are writing an executable natural language story and have no customers to read it, or if you are covering details they will glaze over while reading because they can't imagine or validate what is described, then you're using the wrong tool.

If they glaze over that might just be because they're a bad PM. I've had PMs that glaze over when trying to figure out business-critical edge cases with them because they liked to think of themselves as "big picture guys". That's fine, they just shouldn't be PMs.

I think that the divide shouldn't be between "important to business" and "not important to business" but simply "test implementation" vs "specification".

>edit: Even a suite in code, for technical users, should strive to have more reuse than having each test describe exactly what actions are taken, since that's too hard to maintain. If your tests aren't DRY, you spend too much time updating the 50 places your login form selectors are reflected in the suite and the test suite acts as scar tissue that prevents changes instead of a protective but flexible skin and skeleton that helps the program adapt.

This is actually the main reason why I think PMs typically shouldn't write executable specs. Inexperienced programmers often don't yet have the DRY instinct and the ability to keep a strict separation between implementation details and specification. It's a rare PM that has those skills.

Then again, maybe they just need to be trained. I've also had the problem of massive, headache-inducing repetition and a blurred distinction between implementation detail and specification in huge Word document specs.

>Over-using story tests and dealing with the maintenance burden / poor abstractions, overly-repetitive integration suites that make changes hard and inappropriate use of unit testing that makes refactoring difficult are all really common traps when people don't get the big picture of test automation

Yeah, well, repetitive code and poor abstractions are basically a problem when writing code in any language. Poor tools make that worse, however.

It's one of the reasons why Cucumber is a pile of shit: no inheritance. Most stories in a business app are actually forks off existing stories. It's utterly inexcusable that it doesn't have this feature.

Integration test suites with high coverage and readable stories do not have to be repetitive. Mine aren't.


I think we're in agreement then.

It's indeed nice to have a PM that could produce details that can be synthesized into a technical spec. It's nice when the PM can work with the customer to actually understand those details. It's nice when PMs can actually care about their products, and work with devs to weigh options. I don't usually have those PMs. Lots of organizations have JIRA Babysitters who show up at your desk whenever the political climate changes to let you know that you should drop the urgent thing he asked you to do yesterday, because there's an urgent thing he's got for you today.

The customer, though, always wants what they want, and you can use a carrot instead of a stick with them. I have found some cases where I can sit with the customer and do BA/PM work with them despite the people with those roles: explain the idea of test automation, show them how it can drive the browser, and ask them to describe the application (at a very high level) in natural language in terms of examples, which I translate into Given/When/Then as we work. I can then explain in status updates: "So, the 'simple' scenario for 'user submits an order' is implemented <point to CI output>, but we don't yet pass the story for when a coupon code is used. Should we prioritize the coupon code scenario or a different feature?" or "So, we have implemented this story according to the examples we were given in May; the behavior you described on this morning's call would read 'Given...' instead of 'Given...'. I can update the story and prioritize that ahead of <whatever>, or should we ship with the original behavior and proceed with <whatever>?"

Is it my job or the customer's job to have to do that? Probably not. But it's a way to build a bridge from concrete stupid machines land into fuzzy people land where everything is negotiable, in the absence of the roles or skills to do so without such a tool.

Cucumber is indeed poor: the simple 1:1 mapping of step definition strings to their functions, and of scenario blocks to test executions, is unfortunate.

I'd like to see a natural language tool where I am specifying invariants to a property-based testing tool. "Given a user" implementation = what does a user mean, well, it has a unicode string 'name' etc. and those map to database model as follows, "when I am on the login page" here are the five pages that have login forms, "When I enter my username" find a field 'username' and fill it with the generated user name ... "Then I should be logged in" logged in users see log out, their username in the corner, have access to their models, logged out users do not, etc. Now this blows up into all the combinatorial options, I find out "hey, if a user has emoji in their password, several invariants are violated."
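As a rough sketch of the kind of check described above, here is a hand-rolled property test using only Python's standard library. It is not an existing NL tool or a real property-based testing library. `ToyAuth` is a hypothetical system with a deliberately planted encoding bug, and `ALPHABET`, `random_text`, and `find_counterexamples` are invented for illustration.

```python
import random


class ToyAuth:
    """Hypothetical system under test with a deliberate encoding bug."""

    def __init__(self):
        self._passwords = {}

    def register(self, name, password):
        # Planted bug: non-ASCII characters are silently dropped on storage...
        self._passwords[name] = password.encode("ascii", "ignore")

    def log_in(self, name, password):
        # ...but replaced with "?" at login, so the two encodings can disagree.
        return self._passwords.get(name) == password.encode("ascii", "replace")


ALPHABET = "abcXYZ019 é漢😀"  # assumed sample of ASCII and non-ASCII characters


def random_text(rng, max_len=8):
    """Generate a short random string drawn from ALPHABET."""
    return "".join(rng.choice(ALPHABET) for _ in range(rng.randint(1, max_len)))


def find_counterexamples(trials=200, seed=0):
    """Check the invariant 'a registered user can log in' on random users."""
    rng = random.Random(seed)
    failures = []
    for _ in range(trials):
        name, password = random_text(rng), random_text(rng)
        auth = ToyAuth()
        auth.register(name, password)
        if not auth.log_in(name, password):
            failures.append((name, password))
    return failures


failures = find_counterexamples()
print(f"{len(failures)} of 200 generated users could not log in")
```

In this toy, every failing case has a non-ASCII character in the password, which is the sort of finding the "emoji in their password" example hints at.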

Generally I do not use NL tools or recommend they be used. I'd definitely prefer an environment where they are superfluous because there is a QA automation engineer working with a PM who is willing and able to elicit the necessary details from the customer. Worse yet, all the current implementations seem to be sorely lacking. I just don't think all hypothetical NL tools are categorically useless.


We started with everything in natural language on our products, but I rapidly found myself wanting something more formal.

In the past I worked on formal specs for our products in Coq, but it took too much time, and outside of pure software development I got stuck.

So lately I have been using TLA+ (I used it many years ago but did not find it formal enough in my then naive age and experience) and I must say it is great.

The learning curve is quite steep but not as steep as Idris or Coq (also less formal) and far more practical.

I think the author could have used TLA+, although I did not get a full picture of his executable specs from the article.


Can you share more about what kind of programs you're specifying with TLA+? Business applications, particular algorithms, or something else entirely?


Our firmware, encryption algorithms, app/web and deployments. I recently finished the firmware and deployments and am working on the rest now. The firmware spec helped a lot: we cannot change firmware in the field, so we need to agree on an unambiguous way of communicating the specs for it.


Responding to the main point of the author:

Architectural intent can be expressed in a formal way, but it requires a formal language that allows you to define new abstractions.

And designing such a language is way harder than defining a small DSL with just enough features to formally express your specification.


Yes, that is a large part of what I was saying.

Also, some things are so hard to specify formally that we still don't have any kind of formal spec. Memory concurrency semantics is an example. It is only in the last couple of years that we got a good spec of fixed-size memory accesses. Then Peter Sewell's group dropped the bombshell that if you have mixed-size memory accesses then you can't make programs sequentially consistent even if you add a memory barrier after every single memory access. But we still don't know how to formally specify the memory orderings associated with atomic accesses, instruction fetches, page table walks or device accesses. So, until then, we use the best natural language definition we can and hope we will be able to formalise it soon.
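For readers unfamiliar with the term, "sequentially consistent" means every result must be explainable by some interleaving of the threads' operations over a single shared memory. The sketch below enumerates those interleavings for the classic store-buffering litmus test. It only illustrates the definition; it does not model mixed-size accesses or any real architecture's spec, and the operation encoding is an assumption made for this example.

```python
def interleavings(a, b):
    """Yield every merge of sequences a and b that keeps each one's own order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


# Store-buffering litmus test; x and y both start at 0.
THREAD_0 = [("store", "x", 1), ("load", "y", "r0")]
THREAD_1 = [("store", "y", 1), ("load", "x", "r1")]


def run(schedule):
    """Execute one interleaving against a single shared memory."""
    memory = {"x": 0, "y": 0}
    registers = {}
    for op, location, arg in schedule:
        if op == "store":
            memory[location] = arg
        else:
            registers[arg] = memory[location]
    return registers["r0"], registers["r1"]


sc_outcomes = {run(s) for s in interleavings(THREAD_0, THREAD_1)}
print(sorted(sc_outcomes))  # → [(0, 1), (1, 0), (1, 1)]
```

The outcome (0, 0) never appears, so under sequential consistency it is forbidden. Weaker memory models can allow it unless barriers are added, which is why the formal spec matters.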

Also, there are parts of the natural language spec that I had not seen any value in... until I started worrying about whether the spec itself was correct or could possibly be shown to be correct. And now that I do worry about that, I am seeing new value in those parts.



