
At least there are tests!



Tests that run for 30 hours are an indication that nobody bothered writing unit tests. If you need to run all the tests after changing X, it means X itself is NOT tested; instead you have to rely on integration tests catching X's behavior.


I beg to differ. Having to run the full test suite to catch significant errors is an indication that the software design isn't (very) modular, but it has nothing to do with unit tests. Unit tests do not replace service/integration/end-to-end tests; they only complement them - see the "test pyramid".

I think it's important to point this out, because one of the biggest mistakes I see developers make these days is relying too much on unit tests (especially on "behavior" tests using mocks) and not trying to catch problems at a higher level using higher-level tests. Then the code gets deployed and - surprise surprise - all kinds of unforeseen errors come out.

Unit tests are useful, but they are, by definition, very limited in scope.
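
To make that concrete, here is a minimal sketch (Python; all names are invented) of how a mock-heavy unit test can stay green while a higher-level test catches the real problem - the mock just repeats an assumption (prices in cents) that the real dependency doesn't actually satisfy:

    import unittest
    from unittest import mock


    def total_price(order_repo, order_id):
        # Production code assumes the repo returns prices in cents.
        prices = order_repo.get_prices(order_id)
        return sum(prices) / 100.0


    class MockedUnitTest(unittest.TestCase):
        def test_total_price_with_mock(self):
            repo = mock.Mock()
            repo.get_prices.return_value = [1000, 250]  # the mock encodes our own assumption
            self.assertEqual(total_price(repo, 42), 12.50)  # passes


    class HigherLevelTest(unittest.TestCase):
        def test_total_price_with_real_like_repo(self):
            class OrderRepo:
                # The real dependency returns dollars, not cents.
                def get_prices(self, order_id):
                    return [10.00, 2.50]

            self.assertEqual(total_price(OrderRepo(), 42), 12.50)  # fails, exposing the mismatch


    if __name__ == "__main__":
        unittest.main()

The mocked test only proves the code agrees with its own mock; the higher-level test is what actually exercises the contract between the pieces.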


(terminology nazi mode)

"... is an indication that the software design isn't (very) decoupled ".

You can be modular without being properly decoupled from the other modules.


Hmmm... you have a point, but then, shouldn't it be "decoupled into modules"?


But then are your modules really modular?


In a C/C++ world, a module is usually defined as a file/.dll/.so on disk. So highly-coupled modules are still modules.


Always test "outside-in", i.e. integration tests first, then if you can afford it, unit tests.

Integration tests test those things you're going to get paid for... features & use-cases.

Having a huge library of unit tests freezes your design and hampers your ability to modify in the future.


While I see some value in the red-green unit testing approach, I've found the drawbacks to often eclipse the advantages, especially under real-world time constraints.

In my day-to-day programming, when I neglect writing tests, the regret is almost always about the ones on the integration side. I'm okay with not testing individual functions, or even individual modules. But the stuff that 'integrates' various modules/concerns is almost always worth writing tests for.


Love the idea of this.

In my experience it's far easier to introduce testing by focusing on unit testing complicated, stateless business logic. The setup is less complex, the feedback cycle is quick, and the value is apparent ("oh gosh, now I understand all these edge cases and can change this complicated code with more confidence"). I think it also leads to better code at the class/module/function level.

In my experience once a test (of any kind) saves a developer from a regression, they become far more amenable to writing more tests.

That said I think starting with integration tests might be a good area of growth for me.


In general, I test those things that relate to the application, not those about the implementation.

i.e. Test business logic edge-cases, don't test a linked-list implementation... that's just locking your design in.


Writing functional tests is easy when you have a clear spec. If you do, tests are basically the expression of that spec in code. Conversely, if they're hard to write, that means that your spec is underspecified or otherwise deficient (and then, as a developer, ideally, you go bug the product manager to fix that).
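
As a rough illustration (Python; the rule and names are invented), a spec line like "orders over $100 ship free, otherwise shipping is a flat $5" turns into tests almost mechanically:

    import unittest


    def shipping_cost(order_total):
        # Hypothetical rule lifted straight from the spec.
        return 0.0 if order_total > 100.0 else 5.0


    class ShippingSpecTest(unittest.TestCase):
        def test_orders_over_100_ship_free(self):
            self.assertEqual(shipping_cost(100.01), 0.0)

        def test_orders_at_or_below_100_pay_flat_rate(self):
            self.assertEqual(shipping_cost(100.00), 5.0)
            self.assertEqual(shipping_cost(10.00), 5.0)


    if __name__ == "__main__":
        unittest.main()

If you can't write the test this directly, that's usually the spec's fault, not the test's.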


Right, like I said "complicated business logic". I agree completely, I have no desire to test a library or framework (unless it's proven to need it).


For those who want to know more about this approach: see Ian Cooper - TDD, Where Did It All Go Wrong https://www.youtube.com/watch?v=EZ05e7EMOLM

I post this all the time, it's like free upvotes :)


Integration tests are pretty important in huge codebases with complex interactions. Unit tests are of course useful to shorten the dev cycle, but you need to design your software to be amenable to unit testing. Bolting them onto a legacy codebase can be really hard.


In large systems you can unit test your code within an inch of its life and it can still fail in integration tests.


Exactly. I have a product that spans multiple devices and multiple architectures: microcontrollers, SDKs and drivers running on PCs, third-party devices with firmware on them, and FPGA code. They all evolve at their own pace, and there's a big explosion of possible combinations in the field.

We ended up emulating the hardware, running all the software on the emulated hardware, and deploying integration tests to a thousand nodes on AWS for the few minutes it takes to test each combination. Tests finish quickly, and it has been a while since we shipped something with a bug in it.

But there’s a catch: we have to unit test the test infrastructure against real hardware - I believe it’d be called test validation. Thus all the individual emulators and cosimulation setups have to have equivalent physical test benches, automated so that no human interaction is needed to compare emulated output to real output. In more than a few cases, we need cycle-accurate behavior.

The test harness's unit (validation) test has to, for example, spin up a logic analyzer and a few MSO oscilloscopes, reinitialize the test bench - e.g. load the 3rd-party firmware we test against - then get it all synchronized and run the validation scenarios. Oh, and the firmware of the instrumentation is also a part of the setup: we found bugs in T&M equipment firmware/software that would break our tests. We regression test that stuff, too.

All in all, a full test suite, run sequentially, takes about 40,000 hours, and that’s when being very careful about orthogonalizing the tests so that there’s a good balance between integration aspects and unit aspects.

I totally dig why Oracle has to do something like this. On the other hand, we have a code base that's not very brittle, yet the integration aspects still make it mostly impossible to reason about what could possibly break - so we either test, or the support people get overwhelmed with angry calls. Had we had brittle code on top of that, we'd have been doomed.


If you're only writing tests at the unit level, you might as well not bother. And it's always good to run all the tests after any change; it's entirely too easy for the developer to have an incomplete understanding of the implications of a change, or for another developer to misuse other functionality, deliberately or otherwise.


It could also be that the flags are so tangled together that a change to one part of the system can break many other parts that are completely unrelated. Sure, you can run a unit test for X, but what about Y? Retesting everything is all you can do when everything is so tangled that you can't predict what a change could affect.


> Tests that run for 30 hours are an indication that nobody bothered writing unit tests.

Yes, they were not unit tests. There was no culture of unit tests in the Oracle Database development team. A few people called them "unit tests", but they either said it loosely or they were mistaken.

Unit tests would not have been effective because every area of the code was deeply entangled with everything else. They did have the concept of layered code (a virtual operating system layer at the bottom, a memory management layer on top of that, a querying engine on top of that, and so on), but over the years people violated the layering and wrote code that called an upper layer from a lower layer, leading to a big spaghetti mess. A change in one module could cause a very unrelated module to fail in mysterious ways.

Almost every test was an integration test. Every test case restarted the database, connected to it, created tables, inserted test data, ran queries, and compared the results to ensure that the observed results matched the expected ones. They tried to exercise every function and every branch condition in this manner with different test cases. The code coverage was remarkable, though: some areas of the code had more than 95% test coverage, while other areas had 80% or so. But the work was not fun.
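
For readers who haven't seen that style, each test case has roughly this shape (a sketch only, with SQLite standing in for the database - not Oracle's actual harness):

    import sqlite3
    import unittest


    class QueryIntegrationTest(unittest.TestCase):
        def setUp(self):
            # "Restart the database": every test case starts from a clean instance.
            self.db = sqlite3.connect(":memory:")
            self.db.execute("CREATE TABLE emp (id INTEGER, dept TEXT, salary INTEGER)")
            self.db.executemany(
                "INSERT INTO emp VALUES (?, ?, ?)",
                [(1, "eng", 100), (2, "eng", 120), (3, "sales", 90)],
            )

        def tearDown(self):
            self.db.close()

        def test_avg_salary_by_dept(self):
            observed = self.db.execute(
                "SELECT dept, AVG(salary) FROM emp GROUP BY dept ORDER BY dept"
            ).fetchall()
            expected = [("eng", 110.0), ("sales", 90.0)]
            self.assertEqual(observed, expected)  # compare observed vs. expected results


    if __name__ == "__main__":
        unittest.main()

Multiply that setup cost by every function and branch condition you want to reach, and the 30-hour runtime stops being surprising.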


I'm amazed that it had that many tests that took that long, but ONLY had 80-95% coverage. I understand the product is huge, but that's crazy.


I do. It's about state coverage: every Boolean flag doubles the possible states of that bit of code, so now you need to run everything twice to retain the same coverage.
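
As a back-of-the-envelope illustration (Python; the flag names are made up):

    from itertools import product

    # n independent Boolean flags give 2**n configurations to cover.
    flags = ["feature_a", "feature_b", "feature_c", "feature_d"]
    configs = list(product([False, True], repeat=len(flags)))
    print(len(configs))  # 16 == 2**4; ten flags would already mean 1024 runs

Line coverage stays the same, but the number of states you'd have to exercise grows exponentially with the number of flags.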

FWIW, I know people who work on SQL processing (big data Hive/Spark, not RDBMS), and a recurrent issue is that an optimisation which benefits most people turns out to be pathologically bad for some queries for some users. Usually it's those with multiple tables of 8192 columns and some join which takes 4h at the best of times; now it takes 6h, so the overnight reports aren't ready in time. And locks held in the process are now blocking some other app which really matters to the business's existence. These are trouble because they still "work" in the pure "same outputs as before" sense; it's just that the side effects can be so disruptive.


Writing tests for error handling can be a pain. You write your product code to be as robust as possible but it isn't always clear how to trigger the error conditions you can detect. This is especially true with integration tests.
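
One common workaround is fault injection: rather than trying to provoke the failure for real, substitute a dependency that fails on demand and assert on the error path. A minimal sketch (Python; all names invented):

    import unittest
    from unittest import mock


    def fetch_config(client):
        try:
            return client.get("/config")
        except ConnectionError:
            # The error path we want covered: fall back to safe defaults.
            return {"retries": 3}


    class ErrorPathTest(unittest.TestCase):
        def test_falls_back_to_defaults_when_connection_fails(self):
            client = mock.Mock()
            client.get.side_effect = ConnectionError("injected failure")
            self.assertEqual(fetch_config(client), {"retries": 3})


    if __name__ == "__main__":
        unittest.main()

It's much cruder at the integration level, where you often need the environment itself (proxies, chaos tooling, pulled cables) to misbehave on cue.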


How about XS Max =]]



