
I would argue that "dependency injection" at its purest is little more than the side-effect of designing with intent to minimise module responsibility.

When writing a class, as a rule of thumb I'd posit there's a general advantage to outsourcing units of behaviour or complexity to other modules. Those other modules could be constructed by the class itself, but this is likely to make the current class concerned with the construction details of those modules. One of the simplest things to do, for example, is to ask for those modules to be provided to the class on construction.
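
A rough sketch of that last option in Java, with invented names, just to show the shape:

    // A hypothetical report generator that needs some storage behaviour.
    interface ReportStore {
        void save(String name, byte[] contents);
    }

    class ReportGenerator {
        private final ReportStore store;

        // The dependency is handed in on construction, so this class no
        // longer cares how the store is built or configured.
        ReportGenerator(ReportStore store) {
            this.store = store;
        }

        void generate(String name) {
            store.save(name, name.getBytes());
        }
    }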

This sounds like a great idea until somebody comes along and calls it "dependency injection" and a bunch of us lose our minds.




Fowler has a habit of inventing "special terms" for what people have long considered "programming", and people get sucked into some fundamentalism. I don't like how the author of this article appeals to authority when trying to justify the use of a service locator over DI.

> So if Martin Fowler says that it is possible to use a service locator instead of DI in unit testing, then who are you to argue otherwise?

This argument is the same as "It is possible to use a global variable instead of an argument when unit testing a procedure." That's really what a service locator is - a facade around global variables. Unless, of course, you use DI to inject the service locator, in which case you've not really gained anything, but just inserted another layer of indirection.
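
To make that concrete, a bare-bones service locator is essentially just a static map (a sketch, with invented names):

    import java.util.HashMap;
    import java.util.Map;

    // A minimal service locator: in effect a globally accessible, mutable registry.
    final class ServiceLocator {
        private static final Map<Class<?>, Object> services = new HashMap<>();

        static <T> void register(Class<T> type, T instance) {
            services.put(type, instance);
        }

        static <T> T resolve(Class<T> type) {
            // Any code anywhere can call this, which is exactly what makes it
            // behave like a global variable with a nicer name.
            return type.cast(services.get(type));
        }
    }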

And that brings us back to why we even use techniques which are now labeled "DI" in the first place - they're basically there to avoid the use of globals (and hence, tight coupling). Interfaces are in place to keep implementations decoupled while providing everything necessary for them to interact.


> So if Martin Fowler says that it is possible to use a service locator instead of DI in unit testing, then who are you to argue otherwise?

There are good arguments against a service locator - one of them is presented here: http://blog.ploeh.dk/2010/02/03/ServiceLocatorisanAnti-Patte...

Another argument against the service locator pattern is this: If you ask the locator for a service which has dependencies, you have to resolve those dependencies yourself. So you need to know about the specific implementation of this service interface, which defeats the purpose. If you work around that, you end up with something that is pretty close to a DI container.


I understand the backlash against Uncle Bob and a lot of other celebrities, but I think Fowler focuses on real, practical aspects and does not bullshit around.

> they're basically there to avoid the use of globals (and hence, tight coupling)

One could say that you merely delegate the management of the globals to the DI framework, just as you do with a ServiceLocator.


But in a DI framework, every object has its own name for its dependencies. With ServiceLocator, every object needs to use the ServiceLocator's names, so they are not encapsulated.
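
Roughly, the contrast looks like this (invented names; "Locator" stands in for whatever registry is in use):

    // A stand-in for any global registry; the class and its key are invented.
    class Locator {
        private static final java.util.Map<String, Object> entries = new java.util.HashMap<>();
        static void put(String key, Object value) { entries.put(key, value); }
        static Object get(String key) { return entries.get(key); }
    }

    // With DI, the object names and types its own dependency; it is visible
    // in the constructor signature.
    class Invoicer {
        private final java.time.Clock clock;
        Invoicer(java.time.Clock clock) { this.clock = clock; }
    }

    // With the locator, the object looks things up by the locator's key, and
    // nothing in its own interface reveals the dependency.
    class LocatorInvoicer {
        private final java.time.Clock clock = (java.time.Clock) Locator.get("system-clock");
    }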


I see this argument pop up from time to time and it confuses me. It's almost as if someone read "global variables are evil" and understood it to mean "global data is evil".

Singletons aren't global variables. Neither are service locators. Nor databases. Nor screens. Nor keyboards. Global variables are global variables.

Unless they all are. And if you're going that way, you have a single main function, somewhere.


"Global variables" is really language specific terminology for global data. What I mean by "globals" in the scope of testing is "side-effects" - anything which can affect the behavior of some unit which you're trying to test beyond the test target and the values contained in your test. All of your listed "non-global variables" are examples of it - they make it more difficult to perform isolated unit tests because you need to bring the global state into the test, then you're no longer performing unit tests, but whole systems tests.

A database is a good example. One might have some class "Person" with some business related logic in it. If your goal is to simply test this business logic of a person, then why would your test be concerned about whether it can establish a connection to a database? By definition, this is no longer a unit test, but a systems test, and the test surface is much larger because now you need to be concerned about whether or not a connection can be established and a myriad of other possible problems, all of which could be tested separately, and by knowing which tests succeed/fail, we can have immediate feedback on where some problems might exist, rather than having to debug an entire system to find an issue.

In the case of the service locator - you can't really perform "unit tests" on individual blocks of code which have a dependency on the SL, because the SL is mutated in arbitrary places throughout the codebase. The SL acts to increase coupling, because now instead of depending on just a specific interface, you depend on the whole runtime data of your application.


> "Global variables" is really language specific terminology for global data.

Strange, I thought it was terminology for global variables.

> What I mean by "globals" in the scope of testing is "side-effects"

Which is a whole different kettle of mackerel. Minimising side-effects is, of course, excellent advice, and can help with testing. But global variables and SL are not the same thing as side-effects.

Your example (of testing a database) seems very confused to me. You're talking about coupling now. Not global variables. Why would you suddenly need a database connection? Why does the existence of global mutable state mean that nothing in your code can be tested independently? You seem to be imagining the worst case of coupling as an argument against service lookup.

The last paragraph seems blatantly false. You're again straw manning this version of an SL that is mutated in arbitrary places through the code-base and therefore can't be tested in isolation, and neither can the components that use it. That is a strange version of SL you have here, and one that DI wouldn't help with.


>But global variables and SL are not the same thing as side-effects.

They're examples of side effects. It's not good enough to set the values of a global variable or register some service purely for the purpose of a test, because the test then does not reflect the runtime behavior of the code. The benefit of a unit test is to assert that code behaves the same way all the time - not just for specific values you use at the time of testing.

> Your example (of testing a database) seems very confused to me. You're talking about coupling now. Not global variables. Why would you suddenly need a database connection? Why does the existence of global mutable state mean that nothing in your code can be tested independently? You seem to have a strange idea of how software works.

Global variables increase coupling - code which consumes a global variable now has a dependency on all of the code which mutates it. You simply cannot test the consuming code in isolation without regard for the code mutating the variable, unless your test is exhaustive of every possible value which the global variable may contain.

My example was not of testing a database, it was about testing algorithms or logic that might exist inside some class named "Person", but which has a data dependency on an actual person (held in a database). If one wants to test the logic only, then mock data must be supplied instead of the real data from the database - else you're not testing only the person, but also testing that the database is connected and querying it is successful. The correct way to test this is to decouple Person from the database, usually by means of a mock object, or by passing the mock data into the person directly. Either way, it seems the blog author does not do such unit tests, as he doesn't use mock objects.
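
Sketching that Person example (invented names and logic), the decoupled version takes plain values, and the test needs no database at all:

    // The business logic takes plain values; nothing here knows about a database.
    class Person {
        private final java.time.LocalDate birthDate;

        Person(java.time.LocalDate birthDate) {
            this.birthDate = birthDate;
        }

        boolean isAdultOn(java.time.LocalDate date) {
            return !birthDate.plusYears(18).isAfter(date);
        }
    }

    // The unit test can then supply the data directly, with no connection at all:
    //   assertTrue(new Person(LocalDate.of(2000, 1, 1))
    //           .isAdultOn(LocalDate.of(2020, 1, 1)));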

> I do not use mock objects when building my application, and I do not see the sense in using mock objects when testing. If I am going to deliver a real object to my customer then I want to test that real object and not a reasonable facsimile. This is because it would be so easy to put code in the mock object that passes a particular test, but when the same conditions are encountered in the real object in the customer's application the results are something else entirely. You should be testing the code that you will be delivering to your customers, not the code which exists only in the test suite.

The problem with the author's philosophy is that it means when problems do arise in his applications, he must perform whole system testing/debugging to find them. He is missing perhaps the main benefit of unit tests - which is that, when a bug arises, you can quickly eliminate many possible causes because unit tests against those parts of code have succeeded (unless your unit tests were wrong to begin with, which will more or less be the case if they're testing against code which depends on globals).


You're making me feel very dumb. Because several of these seem to be the opposite of what I've observed.

Testing with a mock object implies that the mock object can generate all the required output that the real object can generate that might have some effect on the consuming code. Not only that, but it assumes that the mock object generates the correct data in ways that cannot generate false positives in the test. This doesn't mean you're only testing the client logic. You're now testing the client logic using services that are ad-hoc and aren't guaranteed to behave like the real thing. You're testing a fantasy.

It is far better to test against the real database. Using a fixture, or a transaction, or some way to use the actual system with representative data. Mocks have their place in very complex services where this is practically impossible. But they don't suddenly make things better for testing, or more atomic. IMHO, when you have to use a mock, it should be as a last resort, when you have to sacrifice fidelity for tractability. Your code is coupled in behavior to the services it uses, pretending it isn't is just fooling yourself.

I have very much the same problem with people who write unit tests against, say, SQLite databases, rather than the full DBMS. The complexity of 'making sure the database is connected and can be queried' is pretty trivial compared to the complexity of mocking a whole RDBMS interface. Good software engineering will, of course, limit the number of places the database interfaces with (I'm not suggesting code with SQL statements in strings everywhere, that's a straw man). But I'd not accept mocked tests that exist just to avoid a database connection or because the developer doesn't understand how to write a transaction.

So I don't understand. Either you're advocating a very bizarre, and seemingly pathological development style, or you're consistently muddying the waters by comparing good programming in your chosen methodology with bad programming in mine, which just misses the point.

Here's an example then. In your Person object, on a platform with reasonable transaction/fixtures support (like Django). Is it better to write your unit test using a mocked ORM layer, or a fixture with the test data in it?

> He is missing perhaps the main benefit of unit tests - which is that, when a bug arises, you can quickly eliminate many possible causes because unit tests against those parts of code have succeeded

I've no idea why this is somehow impossible. I write unit tests at various levels of abstraction. If I have module A, calling module B which calls module C, then I need tests for C, B(+C) and A(+B+C). If I get a failure in A, I make sure that there is a test in B that corresponds to the way A is using B, if so, it is a problem with A, not B. If B and C were mocked, I'd have no way of knowing if the problem was with the mock logic without having to test C, C-mock, B+C-mock, B-mock, A+B-mock.

> now has a dependency on all of the code which mutates it

This seems a bizarre claim. Does your code have a dependency on everything else that can possibly change what's on the screen? If so, how do you deal with that?

That's why pretending 'global variables' = 'all central resources' seems foolish to me.


I probably have quite a fundamentalist view on unit testing because I write primarily in purely functional code these days - where a "unit" is a pure function, and it's clearly an isolated unit. Even when I'm back in OOP world though, I basically avoid static variables/globals like the plague. Even where the framework or some library makes use of them, I'll tend to wrap them up and pass them into my code via Main, to make sure that no statics are globally accessible throughout the code.

If I were testing a salary calculation which takes values from a database, and I named my test "Test_salary_calculation_correct", where instead of using some sample data which could easily cover the range of values I need to test against, I instead relied on a database connection, and this test failed because the database was not accessible - I've only confused the developer who picks up my shit where "Test_salary_calculation_correct" fails, and he thinks there's a problem with my calculation rather than a misconfigured firewall somewhere else. The firewall has nothing to do with my salary calculation - why should it have any effect on the test passing?
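
Something like this is what I mean (a sketch, with invented names and an invented calculation):

    class SalaryCalculator {
        // The values that would normally come from the database are supplied
        // directly, so the test can only fail if the calculation itself is wrong.
        static java.math.BigDecimal monthlySalary(java.math.BigDecimal hourlyRate,
                                                  int hoursWorked) {
            return hourlyRate.multiply(java.math.BigDecimal.valueOf(hoursWorked));
        }
    }

    // Test_salary_calculation_correct, with the sample data encoded in the test:
    //   assertEquals(new BigDecimal("3200"),
    //           SalaryCalculator.monthlySalary(new BigDecimal("20"), 160));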

The way I see unit tests is this: If you write a test and it passes on your machine, then some other developer takes your code and the same test fails - it's a fuckup on your behalf. Unit tests should not depend on the environment in any way. Actually, by definition, a unit test is a test of a single "unit" - including database access into this is well beyond the scope of unit testing, but into integration testing.

To me it seems you're skipping unit testing and just going onto integration testing with your unit testing framework. I'm not sure what you've observed or where, but I can tell you it's certainly not standard or best practice in the industry. It might possibly tell you something about your own code style though - are you writing units which can be treated in isolation? (Certainly not if you depend on a SL, which is a global context of services with no clear boundary)

Ideally a codebase should be designed to maximize unit-testability and reduce the need for integration testing to as little as possible - since this is where most of the "unexpected", or "out of my control" problems are most likely to occur. This testing is more a case of "am I handling all the relevant exceptions" than getting green lights to pass in a unit testing framework. It doesn't really help to make unit tests against code which is expected to fail out in the wild due to whatever circumstance - what matters here is that your code is prepared for the worst and knows how to recover.

It's these cases where mock classes are particularly useful - because you can forcefully simulate any behavior from the external service and make sure your code is working correctly for all the potential circumstances. Having to rely on divine intervention to trigger some event that may only happen 1% of the time in the real-world situation is hardly practical. Unfortunately testing in the wild is often like this - everything works fine 99% of the time.
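
For instance, a hand-written mock can force the rare failure branch on every run (interface and names invented for illustration):

    interface PaymentGateway {
        void charge(long cents) throws java.io.IOException;
    }

    // A mock that always fails, so the recovery path the real service might
    // only hit 1% of the time can be exercised deterministically in tests.
    class AlwaysFailingGateway implements PaymentGateway {
        @Override
        public void charge(long cents) throws java.io.IOException {
            throw new java.io.IOException("simulated gateway outage");
        }
    }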

Even for cases where you're arguing for a fixture with real test data in (from a database), then the reasonable thing to do is extract this data beforehand and encode it into the unit testing language (which is fairly trivial to do). Now you have a reliable test which will continue to work as you update the code. Testing against live data is giving a false sense of security to begin with anyway. Imagine the scenario where you have a bunch of data in the database, you run your unit test against it with all green flags - then after deployment, somebody inserts into the database a value which your code doesn't expect. The unit test shouldn't be testing against real world data, but against data representative of the possible values it should accept (i.e., include all the obvious edge cases which should fail too, but are not likely to exist in the real world DB).


When people are referring to global variables these days, they would often be more accurate in referring to "global mutable state".

Singletons can be this, service locators can be this, databases can be this. This can be bad, notably when assumptions are made about the state without taking into account that you're not the only one that could be changing it. Depending on what's changing, this can lead to very confusing / seemingly unpredictable behavior.

Another way to put it is that the nature of the issue with "global variables" is often present in the things you mention.


I agree with your basic point, but I'm not sure you even need the mutability requirement to capture one of the underlying issues: you have an implicit dependency on something elsewhere in the system. Even if that something is constant, or at least constant during any particular program run, it still means you can't change the code that sets it up without checking your entire code base for unintended consequences. From this point of view, mutability just makes an existing fundamental problem worse, though it introduces new problems as well.


I'm not following what you think you are avoiding and how you think you are avoiding it. For example, I used to work on radar signal processing code. We had several global constants that were used in various calculations in the processing pipeline, one of which was the speed of light at various altitudes. Regardless of the manner in which these constants were inserted into the program, changing one of them was going to have direct effects on several calculations and ripple effects on others. It is just the nature of radar equations.

Thus, if you have a program that has an immutable global constant it is likely in the same situation. How do you ever change that value in any way without retesting everything to see what the effect was?


> Thus, if you have a program that has an immutable global constant it is likely in the same situation.

I don't accept your premise.

Firstly, constants are used for all kinds of things, from hard mathematical data as in your example to configuring connectivity with external services via particular database details or URLs.

Secondly, while some programs really do only have one main purpose and so some of the background/context data really is almost globally applicable, this is certainly not always the case. In particular, the larger a software system becomes, the less likely this is to be true.

> How do you ever change that value in any way without retesting everything to see what the effect was?

If the value doesn't have global scope, you don't have to retest everything, only parts of the system that can be affected by the change.


> I'm not sure you even need the mutability requirement to capture one of the underlying issues

I don't think you do either, although if it's a globally accessible constant throughout a run, problems with inconsistencies only tend to crop up if the value is being used to determine behavior -- which you might need to check for, as you mention.

I'm not sure if global mutable state is the same as the problem you point out though -- they do share a similar nature in the framing of dependencies, but they cause problems for different reasons.

Also, mutability tends to be able to be a real problem in systems much smaller than ones where problems from abuse of global constants start to rear their heads.


> I'm not sure if global mutable state is the same as the problem you point out though -- they do share a similar nature in the framing of dependencies, but they cause problems for different reasons.

I agree, there are (at least) two distinct problems here.

One is the existence of any implicit dependency because of global (or other broad) scope. This can create surprising spacial interactions in the code, making it harder to maintain.

Another is the existence of mutable state. This can create surprising temporal interactions in the code, also making it harder to maintain.

Loosely speaking, the danger from these effects multiplies.


Why is it implicit? The point is that it is much more explicit.

And use of SL doesn't create a dependency that isn't there otherwise. The dependency will be there anyway, because bits of code do depend on one another. The point is, do you want those dependencies to be managed by another piece of code and another piece of configuration? Or is it better to say what you need in the code where you need it?

I'm not religious about it, there are times when being implicit is better, but often it is dramatically more complex for no obvious benefit.


> Why is it implicit?

Because if we have module A depending on module B via some clearly defined interface, but in fact the behaviour of module B also depends on global module C that is set up elsewhere, then B's interface no longer fully describes what A can expect.

If C's scope were limited to what's happening within B anyway then this would just be an implementation detail. If C were given to B as some form of explicit dependency by A, then it would be a specified part of the interface. However, if C effectively has global scope by any mechanism then there is now an implicit interface to change B's behaviour that A doesn't know about. Developers reading or maintaining the code for A, C, and anywhere else that can affect C if it's mutable, then need to be aware of what each other are doing, and changes to any of these parts of the code potentially affect any of the others.
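
A tiny sketch of that A/B/C shape (names invented):

    // C: globally accessible, configured "elsewhere".
    class C {
        static String mode = "default";
    }

    // B: its interface says nothing about C, but its behaviour depends on it.
    class B {
        int compute(int x) {
            return "fast".equals(C.mode) ? x : x * 2;
        }
    }

    // A: reads B's interface and calls compute(), with no way to know that
    // some other part of the program can change the result by mutating C.
    class A {
        int run(B b) {
            return b.compute(21);
        }
    }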

(I'm not going to address your other point, because I didn't say anything about service locators in the first place.)


Yeah. I only heard about "dependency injection" a few months ago, and my reaction was that my brain just didn't get it, because, like, why are you making a huge deal about such a simple thing? If we made this much of a big deal out of every idea in programming, we would never be able to get anything done.

Since then I keep hearing about "Dependency Injection" so my impression is that it's gaining in popularity. But my kneejerk reaction is always that if someone is talking about this subject, they probably are not a very good programmer, just like if someone is talking about how important UML diagrams are. It is maybe a hasty conclusion but that is where my brain goes.


Good use of DI can help one write a more comprehensible, lighter, more flexible system, where each class is responsible for doing just one or two reasonably-scoped tasks. Unlike, say, an obsession with UML, DI is _not_ the hallmark of a crap-grade programmer. Nor is it all that simple, in the sense that it can transform your way of thinking about runtime configuration and object lifecycle management, and promotes a more flexible mindset.

That said, I absolutely despise autowiring and annotation-based DI...


Strongly agreed especially with your last sentence.


Could you provide a concrete example of a good use of DI?


In Java-land, let's say you have a class that depends on a micro-ORM library (or just a database connection, or whatever); the micro-ORM depends on a database connection, which might be yielded by a datasource, which in turn might be wrapped by a bounded connection pool.

DI lets you push all that stuff to configuration rather than wiring it up in code; it reduces drag during development because you can always say, ah, I don't need to think about how or where the micro-ORM boots up, I can tune it later... I can even share the connection pool across five different consumers that are not aware of each other and share no compile-time dependency.
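
As a sketch of the shape (all class names here are invented stand-ins, not a particular container's API): whether the wiring lives in a framework's configuration or a hand-rolled composition root, the consumers themselves stay unaware of it.

    // Invented stand-ins for the real libraries, just to show the wiring.
    class ConnectionPool { ConnectionPool(int maxSize) {} }
    class DataSourceWrapper { DataSourceWrapper(ConnectionPool pool) {} }
    class MicroOrm { MicroOrm(DataSourceWrapper ds) {} }
    class OrderService { OrderService(MicroOrm orm) {} }
    class ReportingService { ReportingService(DataSourceWrapper ds) {} }

    // The composition root (or the container's configuration) is the only
    // place that knows how everything fits together; the two consumers share
    // the same pool without knowing about each other.
    class CompositionRoot {
        public static void main(String[] args) {
            ConnectionPool pool = new ConnectionPool(10);
            DataSourceWrapper ds = new DataSourceWrapper(pool);
            MicroOrm orm = new MicroOrm(ds);

            OrderService orders = new OrderService(orm);
            ReportingService reports = new ReportingService(ds);
        }
    }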


If I had knee-jerk reactions, one would be that programmers who have knee-jerk reactions aren't very good programmers.



