Ask HN: Why don't you write functional tests?
13 points by hugs 1233 days ago | 11 comments
I'm curious why some developers don't write functional* tests for their apps. My theory is that really smart & small teams just starting out probably don't feel the need. But when they get bigger, and have real revenue, users, and reputation on the line, they get serious about end-to-end testing.

But am I wrong?

Why don't you write end-to-end tests?

* I'm making a distinction between unit tests (e.g. no db/disk access, no I/O) and functional tests (e.g. full end-to-end test including the full server and web browser or mobile app)

(disclosure: I created a functional testing tool and started a testing-focused company.)

My experience has been that functional tests are excellent at discovering that "something" is wrong, but less good at pinpointing what is actually broken.

Say my "log in" test fails: OK, I know auth is broken... but what part of it? Is there some JavaScript hijacking the form submission? Is my login view busted? Do I have an error in my authorization system itself? Is the LDAP server I'm authorizing against down? Has someone changed the schema? Etc., etc., etc.

I do find functional tests to be a pretty valuable part of my testing toolkit, but unit tests seem to get me a bigger "bang for my buck", so functional tests tend to be something I don't spend a ton of time working on.

(I wish I did write more functional tests, but clearly they're not valuable enough for me to prioritize them. Otherwise I'd already be using 'em more.)


Thanks, Jacob. :-) Follow-up question: Do you do monitoring in production? Does it have the same issue, in that it tells you that /something/ is broken but doesn't tell you what?


Most of my "unit tests" are functional by your criteria, at least as far as things like file I/O and database access are concerned. Maintain a reasonable separation between UI and the rest of your code, and this remains fairly straightforward even without any tool more fancy than xUnit and a collection of SQL scripts to automatically set up your test database. I've also found it easier in many cases to maintain a single parameterized test loop along with a library of sample data and corresponding "expected results" than lots of little tests in code. This is especially true in situations where you're changing code (and adding tests) directly in response to specific inputs your system failed to handle. And you can frequently re-use the same test loop on larger collections of sample data for other sorts of testing.
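
To make that concrete, a minimal sketch of the parameterized loop I mean, assuming pytest and a hypothetical process_record() under test, with sample/expected JSON pairs in a cases/ directory:

    # Sketch of one data-driven test loop. "myapp.process_record" and the
    # cases/ directory layout are hypothetical placeholders.
    import json
    from pathlib import Path

    import pytest

    from myapp import process_record  # hypothetical function under test

    CASE_DIR = Path(__file__).parent / "cases"

    def load_cases():
        for input_file in sorted(CASE_DIR.glob("*.input.json")):
            expected_file = Path(str(input_file).replace(".input.", ".expected."))
            yield pytest.param(input_file, expected_file, id=input_file.name)

    @pytest.mark.parametrize("input_file, expected_file", load_cases())
    def test_sample_data(input_file, expected_file):
        result = process_record(json.loads(input_file.read_text()))
        assert result == json.loads(expected_file.read_text())

Adding a regression case for an input the system mishandled is then just a matter of dropping two more files into cases/.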

More comprehensive testing including multiple servers, client user interfaces, and so on, tends to be labor intensive to set up and maintain even with good tool support, so here I suspect most of the answers boil down to time and money.


I'm a big fan of unit & functional testing. One reason for not writing functional tests, though, is that they are way harder to write than unit tests (assuming your unit tests are relatively straightforward because your code is well factored).

I always find starting functional tests, especially in a new area of your program, to be a tremendous undertaking. Maybe I'm doing it wrong. For example, last week we moved on to writing code that hits an SFTP server, so our functional test spins up an SFTP server locally and overrides the product code's settings so that it connects to that. Getting this set up and working took me a couple of days, and this is such a tiny aspect of the code under test that I'm afraid I'm exhausting my teammates' patience for "test first functional tests".
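
For the curious, the shape of what we ended up with is roughly the pytest sketch below; start_local_sftp_server() and the settings names are stand-ins for our own harness, not a real library:

    # Rough shape of the SFTP functional test. "start_local_sftp_server",
    # "settings" and "upload_report" are hypothetical names from our codebase.
    import pytest

    from myapp import settings, upload_report
    from tests.helpers import start_local_sftp_server

    @pytest.fixture
    def local_sftp(tmp_path, monkeypatch):
        server = start_local_sftp_server(root=tmp_path, port=0)  # free port
        monkeypatch.setattr(settings, "SFTP_HOST", "127.0.0.1")
        monkeypatch.setattr(settings, "SFTP_PORT", server.port)
        yield server
        server.stop()

    def test_report_is_uploaded(local_sftp, tmp_path):
        upload_report("2012-Q3")  # runs the real SFTP client code end to end
        assert (tmp_path / "reports" / "2012-Q3.csv").exists()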

In other areas, e.g. hobbyist OpenGL code, I think this initial barrier to getting started is so high (you'd have to write a sort of machine vision thing to analyse the actual output, rather than just checking which OpenGL calls were made) that truly end-to-end tests are a complete nonstarter. You have to give in and settle for integration tests instead.

I suspect many people are in that same place with regard to functional testing of web apps. Yes, there are tools like Selenium, but there are issues of test data and of making sure the product code connects to test databases. These are solvable, but people don't see the value as greater than the cost of wrestling with all that.
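
To be clear, the Selenium part itself is not the hard bit; something like the sketch below works fine once a test instance of the app is already running against a disposable, seeded database, and it's wiring up that database that eats the time:

    # Sketch of a Selenium login test. Assumes a test instance of the app is
    # already running at TEST_URL against a throwaway, pre-seeded database.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    TEST_URL = "http://localhost:8000"  # hypothetical test deployment

    def test_login():
        driver = webdriver.Firefox()
        try:
            driver.get(TEST_URL + "/login")
            driver.find_element(By.NAME, "username").send_keys("test-user")
            driver.find_element(By.NAME, "password").send_keys("test-pass")
            driver.find_element(By.NAME, "password").submit()
            assert "Dashboard" in driver.title
        finally:
            driver.quit()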


"a sort of machine vision thing" -- You could use Sikuli or SimpleCV. I'm using OpenCV in my video game playing robot to find elements on the screen. (Of course, it's a bit of a Frankenstein to create and run.) At least it's good to know it's possible, though. But, yes, I can see why others would not want to go through this much work -- it is hard.


Hey. Out of interest, Jason, would you personally write a functional test that spun up a local SFTP server like I describe? As a better-than-average developer, you presumably wouldn't need a couple of days to get that working, but even so, would you consider cheating a little and mocking out the SFTP client calls? If so, would you still call it a 'functional test'?
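
The kind of mocking I have in mind is roughly the sketch below, assuming the product code gets its client from a single factory function (all the names here are hypothetical). At that point I'd personally call it an integration test rather than a functional one.

    # Sketch of the "cheating" version: patch out the SFTP client entirely.
    # Assumes the product code obtains its client via myapp.sftp.connect();
    # all names here are hypothetical.
    from unittest import mock

    from myapp import upload_report  # hypothetical

    def test_report_upload_pushes_file():
        with mock.patch("myapp.sftp.connect") as connect:
            fake_client = connect.return_value
            upload_report("2012-Q3")
            fake_client.put.assert_called_once()  # file handed to the client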


For OpenGL testing:

If you just want to make sure that your changes aren't breaking something that is already working, you could have your code automatically create an image and then compare that image file with what you have verified as correct.

This assumes that the output for a scenario is deterministic, and it requires that you hand-inspect new images whenever a change affects the output.
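
A minimal sketch of that golden-image idea, with render_scene() standing in for whatever actually draws your scene to a file:

    # Sketch of a golden-image test: render to a file and compare byte-for-byte
    # against a hand-verified reference. "render_scene" is hypothetical;
    # tmp_path is the standard pytest fixture.
    from pathlib import Path

    from myapp.render import render_scene  # hypothetical

    GOLDEN = Path("tests/golden/spinning_cube.png")

    def test_spinning_cube_matches_golden(tmp_path):
        out = tmp_path / "spinning_cube.png"
        render_scene("spinning_cube", output=out)
        assert out.read_bytes() == GOLDEN.read_bytes()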


Hey. It's a nice idea, but there are problems with this. The actual output varies depending on your hardware and driver combo, and is substantially affected by the state of the graphics drivers (e.g. what kinds of anti-aliasing or interpolation are enabled, what color profiles are loaded). The RGB pixel values on one machine will not match those on another machine.

You could imagine writing an 'is image almost equal' comparison, but I'm informed by those who have tried (pyglet developers) that this is substantially harder than it sounds - the differences between images are not what you would expect.
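
The naive 'almost equal' that people usually start with is something like the sketch below, a mean per-pixel difference against a hand-tuned tolerance, and the pyglet experience is that even this needs endless per-platform fiddling:

    # Naive "almost equal" image comparison: mean absolute per-pixel difference
    # against a hand-tuned tolerance. The tolerance value here is made up.
    import numpy as np
    from PIL import Image

    def images_almost_equal(path_a, path_b, tolerance=2.0):
        a = np.asarray(Image.open(path_a).convert("RGB"), dtype=np.float64)
        b = np.asarray(Image.open(path_b).convert("RGB"), dtype=np.float64)
        if a.shape != b.shape:
            return False
        return np.abs(a - b).mean() <= tolerance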

The alternative, if you want anyone else to be able to run your tests, is to tie yourself to a particular OS/hardware/driver combo. Not appealing for many projects.

Even if this could be done, this sort of 'compare snapshot' test is brittle, because, of course, we're talking about high level functional tests here, so you'd be snapshotting your whole game/application, not just limited aspects of it in a limited environment. Hence the screenshots would change all the time. Every time you added or modified any functionality you'd get a failing test and have to manually compare the images and assert that the differences were OK and then commit the new screenshot. This is ripe for overlooking small regressions, and makes subsequent bisection very difficult.

Of course, we haven't even got into the fact that, as an end-to-end test, your test code would actually have to interpret the images and send mouse/key inputs to successfully play your game. Through to completion, of course - how else would you know your game-completion conditions were all wired up correctly?
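
To be fair, the sending-inputs half is mechanically easy; something like pyautogui can do the clicking once you've decided where to click (a tiny sketch below, with a made-up reference image). It's the interpreting-what's-on-screen half, repeated for an entire playthrough, that buries you.

    # Sketch of injecting inputs once an on-screen element has been located.
    # "start_button.png" is a hypothetical reference image.
    import pyautogui

    pos = pyautogui.locateCenterOnScreen("start_button.png")
    assert pos is not None, "start button not visible"
    pyautogui.click(pos.x, pos.y)  # begin the game
    pyautogui.press("space")       # e.g. jump/fire; entirely game-specific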


I agree that there are generally problems doing this; we thought a bit about it at a previous job when testing an After Effects plug-in we were developing, but we never actually did it.

Just wanted to add that one technique that could make this work better when testing across different hardware/driver configurations would be to share high-level results (i.e. "for release 1200, these images seem okay") rather than actual images among testers. (So each tester's machine would generate its own "correct" images.) Yes, it's possible that some of the images another tester assumes are correct aren't actually correct on their machine due to its hardware configuration. But if you care about that, you wouldn't be able to share test results anyway. And you could still test actual rendering on different kinds of hardware in a separate pass.
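
Concretely, each tester's machine could keep its own baseline directory keyed by a local hardware/driver fingerprint, so only pass/fail results get shared; a rough sketch (the fingerprint fields are placeholders, in practice you'd include GPU model and driver version):

    # Sketch of per-machine golden images: baselines live under a directory
    # derived from a local machine fingerprint. Fingerprint fields are
    # placeholders; a real one would include GPU and driver details.
    import hashlib
    import platform
    from pathlib import Path

    def machine_fingerprint():
        raw = "|".join([platform.system(), platform.machine(), platform.node()])
        return hashlib.sha1(raw.encode()).hexdigest()[:12]

    def baseline_path(test_name):
        return Path("baselines") / machine_fingerprint() / (test_name + ".png")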


Actually, I do write functional tests, and I have been for a long time. However, I'm not a developer... I'm a failed developer who figured I could best continue to develop if I had a QA title. The ammo I could always use for functional tests was that they reveal a different dimension of wrong than a unit or integration test. And when you have unit, integration, and functional tests, development speeds along at a pretty rapid clip.


An interesting aspect of the "different dimension" is that functional tests actually demonstrate that your application works. If your website is down after deploying a well-functionally-tested application, then it's often because of something beyond your control, like an Amazon outage. Functional tests give you the confidence to deploy often, or automatically, even after bold refactoring, because you can prove that the app works, which unit tests or integration tests cannot do.

