TDD your API (balancedpayments.com)
147 points by steveklabnik on May 16, 2014 | 50 comments

It's really, really nice to see an API with all the validating JSON Schemas publicly available[1]. I wish everybody did this.

We have been doing a lot of thinking at Snowplow [2] [3] about JSON Schema versioning and self-description. One of the key points as it relates to RESTful APIs is: it's not enough to version your API, you should be versioning the individual entities that your API returns; these entities should be able to evolve individually over time, with the focus on evolving the entities' schemas in additive (i.e. non-breaking) ways for existing clients.
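
To make the "additive, non-breaking" idea concrete, here is a minimal Ruby sketch. It is not Snowplow's or Balanced's actual tooling: the helper name `additive_change?` and the rule it applies (existing properties keep their name and type, and nothing new becomes required) are assumptions chosen for illustration, and real JSON Schema compatibility involves many more keywords.

```ruby
# Hypothetical helper: decides whether new_schema is an additive
# (non-breaking) evolution of old_schema. Both are JSON Schema-like hashes;
# only the "properties" and "required" keys are inspected.
def additive_change?(old_schema, new_schema)
  old_props = old_schema.fetch("properties", {})
  new_props = new_schema.fetch("properties", {})

  # Every existing property must survive with the same declared type.
  kept = old_props.all? do |name, spec|
    new_props.key?(name) && new_props[name]["type"] == spec["type"]
  end

  # No field may become required that was not required before.
  newly_required = new_schema.fetch("required", []) - old_schema.fetch("required", [])

  kept && newly_required.empty?
end

v1 = { "properties" => { "amount" => { "type" => "integer" } },
       "required" => ["amount"] }
# Adds an optional field: additive.
v2 = { "properties" => { "amount" => { "type" => "integer" },
                         "currency" => { "type" => "string" } },
       "required" => ["amount"] }
# Changes the type of an existing field: breaking.
v3 = { "properties" => { "amount" => { "type" => "string" } },
       "required" => ["amount"] }

puts additive_change?(v1, v2) # true
puts additive_change?(v1, v3) # false
```

A check along these lines could run in CI against each entity's schema separately, which is the point of versioning entities rather than the whole API.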

There's quite a lot to learn from the Avro community in all this.

[1] https://github.com/balanced/balanced-api/tree/master/fixture...

[2] http://snowplowanalytics.com/blog/2014/05/13/introducing-sch...

[3] http://snowplowanalytics.com/blog/2014/05/15/introducing-sel...

The article is very interesting/good, but I do want to mention one issue I have with it:

“To say that there’s a large amount of literature on the benefits of this approach would be an understatement.”

I've brought this up before, but “literature” carries the connotation of “scientific literature”, and I actually haven't heard of many rigorous, well-constructed scientific experiments that have produced conclusive scientific literature that indicates the benefits of TDD. There are certainly a lot of anecdotal posts and writings, but there seems to be relatively little actual “literature” in the most commonly used sense.

Not saying TDD is bad, of course… Just wondering if this is in fact an overstatement rather than an understatement. If we're going to appeal to authority, we should make sure that authority is valid :)

"Making Software" [1] has a chapter called "How Effective is Test-Driven Development?" where they do a meta review of 32 clinical studies on TDD pulled from 325 reports. They filtered out reports based on scientific rigour, completeness, overlap, subjectivity etc, and ended up with 22 reports based on 32 unique trials. "TDD effectiveness" was defined as improvement across four dimensions - internal code quality, external system quality, team productivity, and test quality - and each dimension is strictly defined in sufficient detail.

They report that high-rigour studies show TDD has no clear effect on internal code quality, external system quality, or team productivity. While some studies report TDD has a positive impact, just as many report it has a negative impact or makes no difference at all. The only dimension that seems to be improved is "test quality" - that is, test density and test coverage - and the researchers note that the difference was not as great as they had expected.

The chapter concludes that despite mixed results, they still recommend trialling TDD for your team as it may solve some problems. However it is important to be mindful that there is yet no conclusive evidence that it works as consistently or as effectively as anecdotes from happy practitioners would suggest.

The meta-study is relatively short and can be read online [2].

[1] http://www.amazon.com/Making-Software-Really-Works-Believe/d...

[2] http://hakanerdogmus.net/weblog/wp-content/uploads/tdd-sr-bo...

Coincidentally, I am currently reading that book, but had not gotten to that part yet. Thanks!

The interesting thing to me about TDD is it almost feels like science: make a test, specifically a test of your assumptions (you assume the test will fail and you assume that the code you will create will fix it), test it, then modify from there.

Development is creative, but answering the questions "does my program work?" and "why is my program not working?" always feels like science.

> I actually haven't heard of many rigorous, well-constructed scientific experiments that have produced conclusive scientific literature that indicates the benefits of TDD.

I was thinking specifically of things like the http://research.microsoft.com/en-us/groups/ese/nagappan_tdd.... when I wrote that sentence, but I guess I also consider books as part of what I'm saying here.

The whole of the industry suffers from this. I know they are vastly dissimilar as software doesn't have the issues that tangible real-world stuff has, but you'd never see someone proposing a bridge being built on the principles read in some book or a paper that doesn't compare any other theory. I dread the day we have rigorous IT standards as much as I can't wait for it. I've heard it is possible with Ada and lots of concern, money, and talent, but I wonder if we'll ever see that elsewhere. I guess as a whole the tools we rely upon are getting better, but I'm not convinced the methodologies haven't been just more perpetual motion machines.

> “literature” carries the connotation of “scientific literature”

> there seems to be relatively little actual “literature” in the most commonly used sense.

Literature: books, articles, etc., about a particular subject

I don't think you should add words to what people have said to make a point unless you get some clarification from the author that the additions actually clarify what they said. There have been quite a few books and articles about TDD.

In industries such as rail and transport, industrial systems software - developed under Waterfall principles - tends towards the TDD approach, and while there may not be much academic/scientific literature on the subject, TDD is fairly well observed and applied in the software industry, because it's also a 'natural act' in many other industries: you plan to fail once, improve, then pass the process, whether you're booting up an OS or indeed a furnace, an accelerator, etc. 'Test before use' is TDD in a tl;dr, where 'use' to a developer means 'give to customer'.

Just try it for a while. If it suits you, then keep it in your bag of tools. If you wait for scientific research to tell you what to do to get better, you will miss out on a lot of learning opportunities. - My two cents

Engineering & anecdotes usually precede science.

We created some nice catapults before we "understood" gravity. Even now we don't fully understand existence.

I don't understand how anyone can develop software with myriad moving pieces without isolating and testing each component as it is written. It's like trying to figure out why your car won't start by sitting in the driver's seat and pushing buttons. I wouldn't ever presume to tell someone else how to do their job, but I can't imagine doing it any other way. (I prefer to write out my spec in BDD style, and unit test when the logic gets hairy).

I work at a development firm that mostly services local corporations, and the average time we have to maintain the software we build is fairly low. My boss and I both appreciate the TDD philosophy, but when client budget constraints are what they are, TDD is one of the first things to go out the window. Our CEO has (understandably) a hard time selling a project for 20% more money when the client won't get 20% more features. Additionally, if/when bugs come up, we're able to sell support and enhancement sprints. So really, TDD would reduce the (already small) revenue stream we get from support, at the benefit of being "one of the cool kids".

Games have such demanding architectures that I can forgive them for not testing, as writing tests for code that's not written for it is very hard.

>I've brought this up before, but “literature” carries the connotation of “scientific literature”

to you. That connotation didn't even cross my mind. Regardless. Are you going to wait until there is a peer-reviewed scientific paper telling you TDD is good before you'll believe it? Do you have personal experience that TDD seems to make your designs better? If so, is that less true to you because we haven't scientifically "proven" it?

The trend "Jabberwocky" comes at a very considerable startup cost, a significant learning curve, and a negative impact on productivity (at least during the early days). In return for strict adherence to its philosophy, it promises to possibly help you create a better designed and more reliable system over the long run. Some cool cats are using Jabberwocky, many more denounce it. Some studies claim it's at least not harmful, others say it's a bureaucratic add-on.

Meanwhile, you mostly deal with small, simple, or short-lived systems and have little expectation in building skyscraper enterprise apps, so the promised benefits fall in the "oh that's nice" bucket. Or perhaps you lead a team that is deeply invested in a three-year codebase not using Jabberwocky, and the cost of getting everyone to switch has a serious financial implication.

The stance of "Jabberwocky sounds nice enough, but I do wonder if anyone systematically proved it is worthwhile" is a perfectly valid position. It is not about what you should do, it is about what you can do, and you can't just follow every trend (also see: Agile+XP) or even trial every alternative. For many non-engineering firms, there is "real work" to be done, and the never-ending increasingly expensive pursuit of operational excellence can be a hard thing to sell internally.

This is a huge benefit of science for the rest of us - a few million people can experiment on the edge of what is known, most of them will get nowhere, but a few thousand will determine that "X448YAB" tends to be superior to "X28YNB". Over time, the remaining 7 billion of us will slowly migrate to the more effective way of doing things and everyone benefits. We cannot denounce someone for not using stuff from the cutting edge until that thing is proven effective.

It is indeed less true, until the possibility that we could be committing a type I error can be ruled out.

It may be that TDD has little negative effect upon development and results in a FeelGood™ response that could be present within any project with non-zero productivity.

The closest thing I've seen to a proper study concludes that testing is high cost and low benefit compared to other QA measures: https://kev.inburke.com/kevin/the-best-ways-to-find-bugs-in-...

What I really like about the author's approach is that it really answers the first question: "How can we validate that our API is working as intended?"

Writing your tests from the client's perspective actually gives you that certainty. TDD is a powerful tool, but in my experience its evangelists get into overly academic debates about exactly what a "unit" is and how strictly you should adhere to the red-green-refactor steps.

For a really thoughtful and in-depth approach to API TDD I recommend Growing Object-Oriented Software, Guided by Tests by Steve Freeman (http://www.amazon.com/Growing-Object-Oriented-Software-Guide...).

It feels so good to see some of the comments here and elsewhere on our approach at Balanced. I've been trying to put this process/system in place for close to two years now. Steve was finally able to make it possible where I couldn't.

There are a lot of other companies that face the same challenge and have also come to the conclusion that there isn't a sufficient solution pre-built. Hopefully, we'll get to the point where there's a standard solution that everyone can use instead of tons of homegrown solutions.

So I recently created a simple RESTful API. Nothing fancy here in 90% of the resources. 10% do some funky things (BTW, which HTTP verb should /login use? POST? You are not creating anything? PUT? PATCH?! Really it should be a GET in this case as I am just retrieving a pre-generated token...). I did add a set of tests to keep me honest about what the API should do. I added them after the API was completed, and based on the assumptions I would be making when working on the client-side part of the app. I figured the tests wouldn't show anything: I just wrote the API so any quirks/weirdness should be carried over to the tests. Boy was I wrong. Several subtle but important bugs did immediately appear. It is tough to test a sufficiently complex read/write API as mocked objects get more and more complex, but for this relatively simple case it was worth it from the ROI point of view.

I recently created a login flow for a RESTful API. The solution I went with was instead of thinking of it as a high level activity such as login, what I was really doing was creating auth tokens to then be used in the Authorization header in subsequent requests.

I created a /tokens endpoint, where I POST the auth credentials and in return I get back a newly generated auth token. In my opinion this is a nice RESTful solution.
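
For anyone who wants to see the shape of that flow, here is a rough in-memory Ruby sketch. It is only an illustration: `TokenService`, the `Token <value>` header scheme, and the response bodies are all assumptions, not the commenter's implementation or anyone's real API.

```ruby
require "securerandom"

# Hypothetical in-memory stand-in for a POST /tokens endpoint.
class TokenService
  def initialize(users)
    @users = users   # username => password (plain text only for this sketch)
    @tokens = {}     # token => username
  end

  # POST /tokens: creates a token resource and returns [status, body].
  def create(username, password)
    unless @users.key?(username) && @users[username] == password
      return [401, { "error" => "invalid credentials" }]
    end

    token = SecureRandom.hex(16)
    @tokens[token] = username
    [201, { "token" => token }]
  end

  # Resolves an Authorization header value to a username, or nil.
  # The "Token <value>" scheme is an assumption made for this sketch.
  def authenticate(authorization_header)
    scheme, token = authorization_header.to_s.split(" ", 2)
    return nil unless scheme == "Token"

    @tokens[token]
  end
end

service = TokenService.new({ "alice" => "s3cret" })
status, body = service.create("alice", "s3cret")
puts status                                         # 201
puts service.authenticate("Token #{body["token"]}") # alice
```

Treating the token as the resource being created is what makes POST the natural verb here, and a 201 the natural response.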

This is actually exactly how we model API tokens: https://docs.balancedpayments.com/1.1/api/api-keys/#create-a...

> You are not creating anything?

You are creating a session. That's why login could be a POST to /users/sessions

And depending on your architecture, a user may decide to create multiple sessions to log in via different computers. Each of which can be audited within the application and remotely logged out.
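
A small Ruby sketch of sessions as first-class resources, assuming an in-memory store. `SessionRegistry` and its method names are hypothetical, and the mapping to HTTP verbs in the comments is only one plausible design.

```ruby
require "securerandom"

# One record per login, so each device can be listed and revoked separately.
Session = Struct.new(:id, :user, :device)

# Hypothetical in-memory registry standing in for a sessions resource.
class SessionRegistry
  def initialize
    @sessions = {}
  end

  # Roughly: POST /users/sessions
  def create(user, device)
    session = Session.new(SecureRandom.uuid, user, device)
    @sessions[session.id] = session
    session
  end

  # Audit view: every live session for one user.
  def for_user(user)
    @sessions.values.select { |session| session.user == user }
  end

  # Remote log out of one session; returns true if it existed.
  def destroy(id)
    !@sessions.delete(id).nil?
  end
end

registry = SessionRegistry.new
laptop = registry.create("alice", "laptop")
registry.create("alice", "phone")
registry.destroy(laptop.id)
puts registry.for_user("alice").map(&:device).inspect # ["phone"]
```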

(Balanced employee)

I love TDD. Not because I write a lot of tests, but, because it helps my mind reason about proper composition and develop modular and clean APIs.

Using it as a design principle creates better software and you get a nice bonus -- regression testing for your assumptions.

It also serves as one vector of documentation -- though it's not the best.

I'm a big believer in testing APIs. As mentioned in the post, these tests can serve a similar purpose as integration tests, and can double as documentation and API health checks.

I've been building a tool that makes it easier to work with HTTP resources. While it's definitely not finished, I've put it online temporarily if anybody wants to check it out:


It lets you define environments, HTTP resources, and tests to be run against those resources. Custom headers, params, and persistent variables (to be used in subsequent tests) are all in there (although there is no help or handholding right now). There is also a command line tool that lets you connect your local server to the service, and run the tests against your local environment.

Please excuse the un-styled login/signup screen, I added it just now to put the app online. Also, this is a very early preview that I did not intend to put online so early, so there will be bugs. I'll be taking it offline this weekend.

This is kinda neat for being able to show some API workflows (like the PATCH update of a document and then GET request to retrieve it), but if you are more interested / content with one-by-one testing of resources, you can use the API Blueprint project (http://apiblueprint.org) to document your API, and then use Dredd (https://github.com/apiaryio/dredd) to test a running version of your API. You can then tie in with their nice visualization tools or upload to apiary.io and take care of your customer-facing docs while you're at it.

We explicitly looked at Apiary during this process, but its lack of hypermedia support was a deal breaker, as you mention.

I had a similar goal, and built a simple shell script[1] that would enable its user to write tests as curl requests.

[1]: http://mattneary.com/Quizzical/

The challenge we had with that approach was being able to reference the response of a previous request.

As a simple example:

1. Create customer

2. Add tokenized card to customer

3. Charge the card

Step 2 requires the HREF/ID of the customer. Step 3 requires the HREF/ID of the card.
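
The dependency between the three steps can be sketched in Ruby like this. `FakeApi` is a hypothetical in-memory stand-in so the example runs on its own, and the URL shapes and body fields are made up for illustration; they are not Balanced's real endpoints.

```ruby
# Hypothetical in-memory API used only to make the sketch runnable.
class FakeApi
  def initialize
    @next_id = 0
  end

  # Pretends to POST body to path and returns the created resource,
  # including the href that later requests need.
  def post(path, body = {})
    @next_id += 1
    body.merge("href" => "#{path}/#{@next_id}")
  end
end

api = FakeApi.new

# 1. Create customer
customer = api.post("/customers", { "name" => "Jane" })
# 2. Add tokenized card to customer (needs the customer's href)
card = api.post("#{customer["href"]}/cards", { "token" => "tok_test" })
# 3. Charge the card (needs the card's href)
debit = api.post("#{card["href"]}/debits", { "amount" => 1234 })

puts debit["href"] # /customers/1/cards/2/debits/3
```

A plain list of independent curl commands has nowhere to keep `customer` and `card` between requests, which is where the approach gets awkward.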

Yeah, I realized the same thing; this aspect really makes it harder to keep the system simple, naturally. I've thought about different ways of approaching this, considering chained requests as a possibility.

Here's an example of how we do it:

The scenario: https://github.com/balanced/balanced-api/blob/master/feature...:

  Scenario: Push money to an existing debit card
    Given I have sufficient funds in my marketplace
    And I have a tokenized debit card
    When I POST to /cards/:debit_card_id/credits with the JSON API body:
      """
      {
        "credits": [{
          "amount": 1234
        }]
      }
      """
    Then I should get a 201 Created status code
    And the response is valid according to the "credits" schema
    And the fields on this credit match:
      """
      {
        "status": "succeeded"
      }
      """
    And the credit was successfully created

There are several things in there like "And I have a tokenized debit card" that set up the scenario to be able to do things like "/cards/:debit_card_id/credits", which refers to an actual card ID created in the test API.
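
One way such a placeholder could be filled in, sketched in Ruby. The `hydrate` helper is hypothetical and written for illustration; it is not the `add_hydrate` method from the Balanced test client, whose internals aren't shown in this thread.

```ruby
# Hypothetical helper: replaces :name segments in a path template with
# values recorded by earlier steps. Raises KeyError for unknown placeholders.
def hydrate(template, values)
  template.gsub(/:([a-z_]+)/) do
    values.fetch(Regexp.last_match(1).to_sym).to_s
  end
end

# An earlier step would have stored the ID of the card it created.
values = { debit_card_id: "CC123" }
puts hydrate("/cards/:debit_card_id/credits", values) # /cards/CC123/credits
```

Failing loudly on an unknown placeholder means a scenario that forgets its setup step breaks at the request line rather than sending a literal ":debit_card_id" to the API.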

That's really quite impressive! I hadn't realized that you meant that an endpoint can be referenced "in the abstract" without explicit parameters, which makes for much easier integration.

For some reason, though, I feel like the natural language would actually intimidate me. Where are ideas like "sufficient funds" defined?

"I have sufficient funds in my marketplace" is defined here: https://github.com/balanced/balanced-api/blob/master/feature...

  Given(/^I have sufficient funds in my marketplace$/) do
    step 'I have tokenized a card'
    @client.post("/cards/#{@card_id}/debits", {
                   amount: 500000
                 })
  end

Here's another example from the scenario above:

  Given(/^I have a tokenized debit card$/) do
    # The request line is missing from this snippet; POST /cards is an assumption.
    @client.post('/cards', {
        name: "Johannes Bach",
        number: "4342561111111118",
        expiration_month: "05",
        expiration_year: "2015"
    })
    @debit_card_id = @client['cards']['id']
    @client.add_hydrate(:debit_card_id, @debit_card_id)
  end

This looks interesting! Might be worth taking a look at the library we have built to validate the HTTP messages using `curl-trace-parser` (https://github.com/apiaryio/gavel)

Thanks, I will definitely look into that.

I'm not sure if it's legit or not to show off your work on HN, but I recently made a VERY small ruby DSL to test RESTful JSON services.

We started having a LOT of trouble with 3rd party services breaking their JSON contracts, so I started using this to test on our Jenkins CI every night to make sure their JSON was constructed properly.


It also helps our front-end/mobile developers make sure the backend devs are creating proper JSON based on specifications.
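
The link to the DSL didn't survive above, so the following is not it. It's a minimal Ruby sketch of the same idea under my own assumptions: the `CONTRACT` fields are invented, `contract_violations` is a hypothetical helper, and the payload is assumed to be a JSON object at the top level.

```ruby
require "json"

# Hypothetical contract: field name => expected Ruby class after parsing.
CONTRACT = { "id" => String, "amount" => Integer, "status" => String }.freeze

# Returns a list of human-readable violations (empty when the payload conforms).
def contract_violations(json_text, contract)
  payload = JSON.parse(json_text)
  contract.each_with_object([]) do |(field, type), errors|
    if !payload.key?(field)
      errors << "missing field: #{field}"
    elsif !payload[field].is_a?(type)
      errors << "#{field} should be #{type}, got #{payload[field].class}"
    end
  end
end

good = '{"id": "CR1", "amount": 1234, "status": "succeeded"}'
bad  = '{"id": "CR1", "amount": "1234"}'

puts contract_violations(good, CONTRACT).inspect # []
puts contract_violations(bad, CONTRACT).inspect
# ["amount should be Integer, got String", "missing field: status"]
```

In a nightly CI job, a non-empty list would fail the build and name the field the third party changed.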

I struggle with this all the time. Data driven tests are awesome because you write so little code. But the code is often abstract and the data is long and hard to read.

I have never been a big fan of Cucumber either and I'm not convinced it's more readable -- isn't it just more verbose? Interesting point about it being language agnostic though.

I guess I don't have a better solution to contribute. Just wanted to say yeah I have that problem too.

I also see Cucumber as a way of gluing specifications with implementation allowing you to regression test specifications.

You can end up with a very domain specific testing "language" (DSTL?) enabling developers and domain experts to start describing a lot of different behavior (specifications) to see if the current implementation supports the behavior.

If current implementation does not support the behavior, then you now have a specification to implement the behavior.

I've used tests similarly when designing APIs, partly to subject myself to my API's UX flaws.

I write mostly functional/integration tests[1]. I try to avoid needing unit tests by making mistakes/errors of that sort impossible by construction (types).

Good post.

[1] https://github.com/bitemyapp/bloodhound/blob/master/tests/te...

This is great, totally see the value for an API. I'm interested to know what people use for, say, native desktop apps, which is what I'm building. I ask because I tried writing a few tests a couple of years ago, didn't see the point and haven't written any since. My code is stable, it's dogfooded every single day. Anyone have any thoughts on TDD in this kind of scenario?

I find that in order to get natural feedback early in a development cycle, you need to write so many interdependent bits of code to get something actually working that any mistakes you made along the way are buried in a mountain of spatial complexity. By testing at a lowish level along the way, you don't have to wait until there's a button to press to see if things are working as expected. Long feedback cycles make for tedious debugging sessions. I consider testing more useful as an aid for authoring code than as proof that it works after the fact, but that's a nice side effect. It also forces good separation of concerns (monolithic procedures are difficult to isolate). I don't worry about 100% code coverage; I test what I need to test to feel confident my code is working.

Edit: I also write a lot of throw-away tests as I go. I find that the easiest way to figure out how third party libraries work is to write tests against the documentation. If you find something unexpected that way, you can be pretty sure it's somebody else's fault.
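
A rough Ruby sketch of what such throw-away tests against documentation can look like, using the bundled `json` library as the stand-in for a third-party dependency. The `LEARNING_TESTS` table and the `raises?` and `failed_claims` helpers are hypothetical names for this example; a real project would more likely use its normal test framework.

```ruby
require "json"

# True when the block raises the given error class, false otherwise.
def raises?(error_class)
  yield
  false
rescue error_class
  true
end

# Each entry pairs a claim read in the documentation with a check of it.
LEARNING_TESTS = {
  "JSON.parse returns string keys by default" =>
    -> { JSON.parse('{"a": 1}') == { "a" => 1 } },
  "JSON.parse symbolizes keys when asked" =>
    -> { JSON.parse('{"a": 1}', symbolize_names: true) == { a: 1 } },
  "JSON.parse rejects unquoted object keys" =>
    -> { raises?(JSON::ParserError) { JSON.parse("{a: 1}") } }
}.freeze

# Returns the claims whose checks did not hold.
def failed_claims(tests)
  tests.reject { |_claim, check| check.call }.keys
end

failures = failed_claims(LEARNING_TESTS)
puts failures.empty? ? "all claims hold" : "check these: #{failures.inspect}"
```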

Interesting. We don't do much agile, and do a LOT of up-front design. Mostly out of necessity (you can't create secure software using scrum, kanban or XP). Defining the architecture before we start is where we do what you describe as doing with tests.

If you think about it, tests themselves are really client code for your class interfaces. So TDDing an API is just carrying that metaphor over to services.

> TDDing an API...

Yes and no. Depends how you "write" those tests. In this case, where you describe the scenarios, it is indeed true. However you can take a different approach – without writing explicit tests (and thus the client code). Rather you just describe your API in a sort of contract and then, as you iterate, you verify the implementation is living up to this contract...

Funny. I read this title, and thought "Oh, like Balanced Payments does", and then I read the brackets :)

Specification by example. +1
