There are some pretty basic strategies to make building "large" software easier. All of them were attempted, but in an entirely broken way:
- Break the architecture down into smaller components that can be tested independently.
- Break the functionality down into smaller features that can be released independently.
- Develop automated tests at the system level (i.e. integration tests, not unit tests).
The software took 2-4 minutes to launch in development, and very few of us knew how to meaningfully test it. This meant that any change that compiled (and passed code review) would ship.
Oh yeah - and the system used binary structures in shared memory to communicate across processes, which were running code written in two entirely different languages.
What could go wrong?
"We should just have used binary for everything!" - said every C/C++ developer ever.
"We should just have used JSON for everything!" - said everyone currently.
2. Requires code generation. Not a big hassle, especially if your IDE hides it, but a hassle nevertheless.
I've never understood the objection to code generation (or IDLs). If you actually care about what the serialized form looks like (and compatibility, etc.), this is by far the sanest way to do it.
It's also incredibly simple (i.e. transparent to inspection), and it's a good way to do cross-platform/language marshalling.
As long as you have full control over the delivery tool-chain, generation is not much of a problem, even less so if your stuff is open source.
But what if you're building software that generates code for long-running projects/code bases (e.g. aircraft, defence, power plants, telecom equipment, medical devices, mobile radio base stations)? Selecting some closed source tool to generate foo can cost you dearly.
You have to version these tools (e.g. the binaries of the generators, if you don't have the source) in your VCS to provide reproducibility. OK, you can also ignore all this and simply version what was generated, losing control over the generator, but that is a horrid idea and you would never do that, of course.
Yes, that would be insane. So don't do it? If you willfully choose badly suited tools, then you get what you deserve.
> You have to version these tools (e.g. the binaries of the generators, if you don't have the source) in your VCS to provide reproducibility.
You can vendor source code generators. If you want to be extra-certain that generator output remains the same across upgrades, you can also check-in the generated code and manually inspect any changes that happen. (I'm assuming relatively sane source code generators that don't intentionally obfuscate things.)
So, yes, if you choose a terrible code generator then you'll be in trouble.
What alternative approaches would you suggest, incidentally, where you're not also just as much at the mercy of the whims of the tools you're using?
You can get around the lack of strict typing by adopting a schema format, but then you have a tooling problem: the languages you need to support may not all support your chosen JSON schema format.
Other solutions such as protobufs exist and are far better for transmitting large amounts of binary data.
Sharing C structs across languages is brave, but it is doable as long as padding/offsets/lengths are expertly crafted and very strictly controlled.
But, on the other hand, sharing pointers would be madness.
Suffice it to say, I wasn't the least bit surprised that it was a complete and total catastrophe, and even less surprised that Oracle tried to sue Oregon for attempting to use the software. Nothing like LDD - Lawyer Driven Development.
I don't understand that
The sad thing is it shouldn't have been such a difficult thing to develop. Complex, sure, but no more complex than many other similar systems. It was destroyed by too much meddling from the government, which kept adding, removing and changing the spec every other day, and by many, many managers just finding nonsense jobs for themselves that were totally pointless.
It is shocking it went so over budget and was allowed to continue doing so for so, so long.
HSC Northern Ireland
It is, as far as I've seen, the 'greatest' fuckup in UK development history: £12-something billion down the drain, massive overspending. They basically made every mistake you could make short of setting the office on fire.
This would involve either:
(1) Clear and unambiguous definition of "mismanagement" involving distant-from-the-specific-project procedural mandates, which would reduce "mismanagement" by the definition applied but increase (and mandate) what would be functionally mismanagement for specific real projects (lots of government regulation designed to prevent mismanagement all around the world, while usually not specifically felony criminal rules, works this way now), or
(2) Being so ambiguous and vague as to provide a basis for arbitrary prosecution, such that it would make sure no one wanted to touch management of a government project at all (at least -- assuming this was the US or a country with similar fundamental rules as to what can be an enforceable criminal law -- until it was inevitably ruled unenforceable as impermissibly vague).
Much of that waste is a product of the "higher level of control" adopted, in law and policy governing government IT work, in response to previous failures, which has mandated additional bureaucratic process, and causes more and more decisions to be made farther and farther from the people with either the specific business knowledge or the specific technical knowledge of the project being executed as more layers of "higher levels of control" are implemented.
I am not really sanguine about the prospects of more of the approach that has made things worse suddenly making things better instead.
They were rarely used until recently, when they became a favorite for prosecuting police officers and police employees for egregious behavior resulting in deaths and injuries (where traditional charges such as manslaughter have failed to convince juries, who are always reluctant to convict). The other favorite use is to prosecute civil servants for speaking to the media (aka leaking).
However, no sign of it being used to deal with mighty screwups on government IT. I do like your thinking on this.
The latest project to fail was a system for the Danish Tax Office. It was supposed to automatically collect debt owed by citizens and companies. It's not entirely certain how that was supposed to work, and it has been described as a utopia project with an unprecedented expectation of what kind of problem computers are capable of handling. The specification was 9,000 pages long, tax law not included, while somehow failing to supply a sufficient amount of actual requirements. It failed, of course, costing almost $150,000,000 (DKK 1,000,000,000), not including the 78 billion Danish kroner the tax office hasn't been able to collect due to the system not being delivered on time and generally not working.
What's sad is there are lots of smart coders in Denmark, a small team of whom could easily do ANY of the many failed projects.
But because you need a certain certification by the bureaucracy, what tends to happen is one of the major players gets the mega contract and splits it with some of their friends, who split their piece with their mates, and so on until you have an organisational ball of spaghetti. And remember, the code will end up looking like the organisation.
You know the old adage "nobody gets fired for buying IBM".
I have a friend that's partner at a consultancy in Copenhagen and he says IBM is often their competition.
The government knows that it's illegal, yet CSC's contracts have automatically been renewed, multiple times.
But yes, IBM have a large number of contracts as well. CSC just seems to be behind the worst projects. I suspect they bid for projects and just focus on delivering exactly what's in the contract, knowing that it will never work. Either that or they are even more incompetent than anyone could imagine.
From my discussions with that friend I mentioned though it sounds like the attitude is changing and there's more work going to smaller shops like his (though friendships with government officials seems to help loads when it comes to securing work). Honestly it sounds like a really exciting environment to work in when we talk about it and I wish I had the time to really learn Danish so I could have a decent chance of securing good work out there.
Seriously, someone fill me in please. How can a software project possibly ever cost more than a billion pounds? You could employ 20,000 software engineers full time for £50K ($70k) a year for that much, and I expect the number of engineers working on Universal Credit is much lower than that and paid less.
Just because one team of developers could do something in 200 man hours doesn't mean you can't spend and bill 2000 for it.
And that's just development. Data migrations and user training/support are going to be massive issues if you replace or newly introduce a widely used system.
Yup. My employer is undertaking a massive IT project (many many many millions of USD), and it's my understanding that training accounts for over half of the budget.
It doesn't mean you should pay, or plan, to spend 20,000 man hours in meetings and devote huge chunks of your budget to expensive licenses for something that could take 200 man hours, especially when it's taxpayers money.
Sure there are unavoidable overheads but why does our government keep splashing billions on them and failing, with little accountability?
Training is going to be expensive, sure, but perhaps plow millions of pounds into that after the program is finished. Maybe don't use the same massive, costly consultancies that fail to deliver? The Obamacare website was a great example of this IMO. Costly consultancy failed to deliver, small agile teams brought in to save the day - if only a bit too late.
Here is the simplest example. Say you need a device that does X. Now, I can go down to the local hardware store and pick up a device that does X (and Y and Z) for $40. But the law is written so that I can't just buy it where I want, because I might be wasteful and I might buy it from my buddy. So instead, I have to put out a bid that says I want a device that can do X and only select whom ever wins a bid. But there are a lot of regulations in who can bid, so the local hardware store doesn't ever even try. So I say I want a device that does X, and I end up getting a couple offers, the best of which is a device that barely does X for $200. That is the best offer I receive, so I have to take it. So now I have a much worse device at a much higher cost. Now, multiply this with every single thing you need to pay for in a software project (including other software, licenses, hardware, manpower, even the paper to print specs on).
OK, after we did the bid and that's the result, why can't I just go buy something if it's less than half the price? Even if I buy it from my buddy, as long as it fits?
That's the theory anyway.
*I know UC is British and while I suspect they have similar constraints I won't swear to it.
Which, IMO, are extremely important investments in any 20,000-developer undertaking. Valve Software is famous for having a relatively flat hierarchy, but they also have ~330 employees.
And this is before the lawyers get involved with their reams and reams of regulations that all make your job more difficult. If you think you could just dump the database and load it on the new system think again, that's an unacceptable data risk, you need to broker all of the data through yet another system.
By the way you cost your employer a LOT more than your salary. There's taxes, rent on the building, utilities, equipment and supplies, support staff, overhead staff, yadda yadda.
2. Advertising/marketing budget is in there somewhere.
3. So are acquisitions (Oculus etc.).
4. Operational costs for their datacentres, ops staff etc.
> Whilst original estimates from the DWP planned for the administration costs for the rollout of Universal Credit were £2.2 billion, by August 2014 this estimate was revised to £12.8 billion over its "lifetime". The cost has since increased to £15.8 billion. At that date only 7,000 claimants were receiving UC, although there are now over 175,000.
The cost of the rollout is £15.8B. Its customer base is currently 175K. There's no indication that costs won't increase when the customer base increases by an order of magnitude. Besides simplifying Facebook's business model and operations (akin to calling Google just a big web scraper), you seem to be underestimating what's up ahead for UC, or you seem to think that software is just built and then maintains its value and functionality with no future costs whatsoever.
That's an entire rollout which entails designing, building and operating a payments system in parallel with several different existing systems it replaces, all the extra non-dev staff needed to keep things running and identify who gets which payment entitlements when, and all the additional administrative costs, which presumably involves an awful lot of explaining to the public how their benefit entitlements have changed. Actual software development probably accounts for a relatively small proportion of that. The status quo admin costs £3.5bn per annum, so an extra billion a year actually isn't as big a deal as it sounds.
If we can simplify all that down to a "bloated Java app" then we can certainly dismiss Facebook as a PHP web forum and suggest that its running costs don't compare particularly favourably, even if it does serve an awful lot of ads.
Judging by the reports of delays and admin cockups, if anything the UK government has probably underspent.
> Richard Granger the former Director General of IT for the NHS, took up his post in October 2002, before which he was a partner at Deloitte Consulting, responsible for procurement and delivery of a number of large scale IT programmes, including the Congestion Charging Scheme for London. In October 2006, he was suggested by The Sunday Times to be the highest paid civil servant, on a basic of £280,000 per year
Deloitte are a cancer on society and business. Presumably he learned how to ruin companies there.
- We are going to fine you if you don't deliver on time
- Ok, but then we have to triple our price
In particular, "The Staggering Impact of IT Systems Gone Wrong" and "Monuments to failure" are good lists of failures, many of which are not included on Wikipedia's list.
For my money, a real failure is like the giant BART fiasco in the 90s where they spent 80 million dollars on a new train control system that simply never existed, at all. They had to sue General Electric to get their money back.
If it didn't work? Hopefully not.
But military stuff is a bit different. Military systems tend to try to push the state of the art as far as possible, for as much edge in combat as possible. That makes for much higher chances of failure.
* I've been on projects where we've rewritten applications from the ground up that were totally successful, and we were better able to build in the future because we had a clean setup.
* I've seen government IT shops hire contractors to do a large project and ended up with something better than the internal devs would have done.
Both of these are situations where the opposite case (time-consuming and expensive failure) gets a lot of attention. (and, in my experience, failure is more common). Show me one of those failures and my instinct is to say "Of COURSE that was a failure! It was obviously going to fail!". But, in reality, not always.
Rather than building unhelpful conventional wisdom, we should be trying to understand the distinctions between the failures and the successes.
Absolutely shocking how ineffective the system of contractors and RFPs is. Part of it is a mindset, part of it is just needing better options.
Anyone in charge of a budget is potentially corrupt. I'd argue that it is more common in corporates than in government, because the personal damage that could be inflicted is less. A company doesn't want reputation damaged by being involved in fraudulent bidding (so will sweep under the carpet readily), and as long as the job gets done, nobody cares (unless internal politics get stirred). The employee may be fired, but they can go somewhere else. And all is off the public record.
* The Distributed Computing Environment (or at least the OS/2 portion of it, anyway), https://en.wikipedia.org/wiki/Distributed_Computing_Environm...
* Taligent, https://en.wikipedia.org/wiki/Taligent
* IBM Workplace OS, https://en.wikipedia.org/wiki/IBM_Workplace_OS
And that's just a few from off my personal resume.
Well isn't that ironic.
Waste Management: http://www.computerworld.com/article/2517917/enterprise-appl...
(Okay, yes, the software itself is a problem too but it's not the major problem)
But outside of that one thing.. lol
The UK tried a NOMS (offender management) system and that was an expensive disaster too..
Would be nice to put down the contracting companies who were ripping off the Govt, along with the civil servants who oversaw the massive waste of taxpayers' money/time.
I'm quite sure there are some spectacular failures in the corporate world that won't make the list.
The ones you hear about are the ones the government ministers foolishly chose to boast about in some answer in parliament or announce in some TV interview. Normally that's when consultants pounce on it and offer to help make it succeed [i.e. vampire all the money away in insane fees]. A minister announcing it is practically the death-knell of a project.
The ones which succeed generally keep publicity and announcements to an absolute minimum and keep tight control. That's way more difficult than you imagine on a government project, as civil servants go hunting for projects to buddy up to in order to pad their promotion CV. If your project budget is more than the last one they screwed, they're all over it so they can boast about the incremental increase in projects they contributed to - a major promotion point (as no-one checks if the project succeeded, the promotion hinges on how much public money it pissed away).
I was ecstatic to leave UK Gov projects behind.
When a piece of private software flops, it still exists as a product and however bad, often offsets some of its cost in sales.
Since most public software is offered for free, no cost is visibly recouped.
Healthcare.gov is actually a great example. It was a colossal failure at first, a failure easily measured in dollars spent. However, it has largely functioned in the years since, but measuring its positive economic impact is hard because the service is delivered free. None of that money has been reclaimed, but it certainly has had some positive economic value.
With all these companies running on Excel macros, years old SAP R/2 or 3 (just two common examples) and so on, the list will be extensive.
(COTS = commercial off-the-shelf)
It's a fascinating listen into the psychology behind gigantic failures.
My personal experience has been that it almost always boils down to upper management/leadership issues, that result in hiring weak project managers which in turn leads to talent drain on the implementation side.
I estimate that most of these have occurred in apparatus of the state, including organizations that superficially are not but for all practical purposes (funding, license, regulation) are.
Wise man once say: there's nothing easier in this world than spending someone else's money.
A startup could probably build the basic version of many of these apps no problem, but that's only the tip of the iceberg. When you have to then import millions of customer records, which were not stored in any particular format (they're idiosyncratic with each doctor/office), and then integrate with the COBOL based system vendor 1 uses, and the IBM Mainframe that vendor 2 uses, oh, and those systems were never intended to interconnect with anything and the vendors are years behind schedule as well, and all while you have lawyers breathing down your neck with a 28 inch thick binder with all of the regulations you need to follow written in impenetrable legalese you can see how it can cost billions of dollars.
Your startup guys' heads would explode if you asked them to write a formal specification for an interconnection system between their elegant web frameworks and a big binary-blob COBOL-based record system on an IBM 360, with only hardcopy documentation written by a vendor that went out of business in the 80s. And yes, healthcare systems still run in environments like that for the reasons mentioned above: making a new system is too hard/expensive.
Oh, and the vendor that had the lowest bid for the contract is planning to outsource to some country that doesn't speak English and is 12 hours offset from your local time.
Maybe because that particular Wikipedia page's editors are from the UK or US? [edit: just speculating, don't know this for a fact]
If I recall correctly, the Gun Registry in Canada was a huge, expensive debacle, with a lot of the costs coming from the underestimation of the software development.