
Warning signs for TSB's IT meltdown were clear a year ago, according to insider - YeGoblynQueenne
https://www.theguardian.com/business/2018/apr/28/warning-signs-for-tsbs-it-meltdown-were-clear-a-year-ago-insider
======
cleansy
Yeah, these things don't "just happen". I cannot imagine, even in a relatively
low-stakes environment (let's say a photo sharing app), fucking up that badly
without having a heart attack.

I am in Spain right now and I have to get cash from an ATM soon; it feels like
Russian roulette to do so. I had to use a money transfer service within the UK
to pay a dentist bill, because TSB's online banking website was showing an
outdated phone number of mine that is used to verify transactions for new
recipients. And of course, when you change the number you cannot use the new
number for 2 days because of "security concerns".

IBM's involvement in the case doesn't fill me with confidence either.

I really hope this disaster is finding its way into MBA courses as an example
of why you need a sane migration path, "no matter" the costs.

EDIT: removed the request for recommendations for new banks, should be a
different thread.

~~~
ethbro
Presumably they plunked down the cash to hire an IBM A-level team, not the
turn-the-crank level most people gripe about.

~~~
Already__Taken
This is funny because in the UK an A-level team would be a bunch of
19-year-olds thinking about going to university.

So I'm still not sure what you intend to say about the quality of IBM.

~~~
DC-3
s/19/18 and 17

------
thisisit
I worked on a bank migration project, early on in my career, and these things
are a nightmare all around.

First, building something like this requires an acute understanding of banking
software. And banking software means something written in COBOL, RPG400 etc.
These languages are pretty old and hence finding talent for them is like
trying to find a needle in a haystack. So most of the stuff has to be done via
brute force, trial and error, and most of the analysis provided by experts and
"senior" business analysts is just that: analysis. Engineers have to bang
their heads against the wall to glue stuff together.

Secondly, everyone has to be on the same page. The idea has to be that
customers are the priority and egos are not.

So stuff like this shouldn't happen at all:

> To make matters worse, the Sabadell development team did not have full
> control – and therefore a full understanding – of the system they were
> trying to migrate customer data and systems from because Lloyds Banking
> Group was still the supplier.

In our case, the team managing the original product made it difficult for us
to merge customer data. They would frequently seed the data incorrectly and
refuse to provide a proper data dictionary. They demanded training on the new
product and that all data transformation should be done by them.

Needless to say, after spending 3 years and hundreds of millions, the project was
scrapped. The migration was never completed and both banks remained on their
respective systems, kicking the can down the road.

~~~
arthurfm
> banking software means something written in COBOL, RPG400 etc. These
> languages are pretty old and hence, finding talent for these is like trying
> to find needle in a haystack.

It's really interesting to see which of the new UK and European 'challenger'
banks purchased third-party banking software and which decided to spend the
time writing their own modern banking systems [1]. A few examples from the
linked article...

 _Monzo: For banking ops, it decided to build its own platform. Technology
used is mainly open source: Linux, Apache Cassandra distributed database (used
by the likes of Apple and Twitter), Google’s Go (golang) programming language
at the back-end and PostgreSQL relational database. The system is hosted at
two data centres in the UK on Mondo’s own hardware. There is a team of 16
people working on this._

 _Atom Bank: It has created a hefty technology set-up in its run up to the
launch: FIS’s Profile core banking system; FIS /Sungard’s Ambit Quantum and
Ambit Focus for treasury and risk management; Iress’ Mortgage Sales &
Origination (MSO) suite for mortgage business, front-to-back office; Wolters
Kluwer’s OneSumX for regulatory reporting; Intelligent Environments (IE) for
front office capabilities; CSC’s ConfidentID system for security; Phoebus
Software for secured business lending and account servicing for residential
lending; and WDS Virtual Agent for customer queries supplied by WDS (a
subsidiary of Xerox)._

 _Starling: The bank has an in-house developed core system. It uses GPS and
Bottomline Technologies for processing and payments operations, respectively._

At what point does it become cheaper to buy one of these smaller banks and
migrate everyone across to their platform?

[1] [https://www.bankingtech.com/2018/04/uk-challenger-banks-whos...](https://www.bankingtech.com/2018/04/uk-challenger-banks-whos-who-and-whats-their-tech/)

~~~
obeattie
I’m Monzo’s Head of Engineering. A lot of the technologies mentioned above for
us are accurate but some aren’t in use - notably PostgreSQL - and some others
which are crucial are missing - eg. Kubernetes. The vast majority of the
platform runs on AWS. Our engineering team has grown to around 60 now.

Our entire backend has been built in-house using modern tech. You can read a
bit more about it on our blog if you’re interested:
[https://monzo.com/blog/2016/09/19/building-a-modern-bank-bac...](https://monzo.com/blog/2016/09/19/building-a-modern-bank-backend/)

~~~
collyw
Modern doesn't necessarily mean good. AWS hasn't been without its share of
problems. "Tried and tested" is something I would be looking for with banking
software.

Saying that, TSB's stuff has just gone tits up, and as far as I am aware they
aren't using anything too trendy...

------
yomly
Maybe it is part of crafting an article, but did anyone else find the contrast
between the looming tech disaster and the team's champagne celebration really
unfair?

The issue was obviously a systemic one, and that 18-month slog would have
been a horrific death march for the people actually working on the project, so
why shouldn't they be able to celebrate it being "over"?

Granted, it was a failure, but I'm not really sure what the floor staff - the
actual "software engineers" who were pictured - had to do with it when they
were set an impossible goal to begin with...

~~~
snarfy
Fixed staff size. Fixed deadline. Fixed feature set. Something was going to
give. You can't have all three.

~~~
tootie
If you fix scope and timeline, then the thing that has to give is quality. It
doesn't seem like they failed to deliver all the parts of their system, they
just didn't all work correctly.

------
lowken10
I worked for Lloyds TSB around 2003/2004. In banking the domain knowledge
(banking & finance) is more valuable than the technology knowledge. There was
a guy at Lloyds who couldn’t write a line of code, but he knew every field and
every column and what that field meant and why it was there.

This guy was as close to unfireable (is that a word?) as it gets.

~~~
danburbridge
That sounds familiar. I was there too around 2003-2005. Very little of what
has been said surprises me, and it rings a lot of bells from my time there.
Hugely siloed and with very strict hierarchies.

~~~
sizzle
This is why I'm leaving fintech for good. Is healthcare worse??

~~~
tim333
Having read a bit, yeah it seems worse. At least banking is about numbers you
can put in a spreadsheet, health less so.

------
tonyedgecombe
They announced this week they are going to bring IBM in to help resolve the
problems. Somehow that doesn't fill me with confidence.

~~~
wiredfool
Now they have N+1 problems.

~~~
votepaunchy
This could even be 2*N.

~~~
BillinghamJ
N!

~~~
noir_lord
(N^N)!

------
S_A_P
Systems integration is one of those problems that is new every time because
every environment is different. Sure, patterns arise and sometimes they can be
cut/pasted between organizations, but most of the time there is just enough
difference to make it a huge risk. I have written at least a dozen
integrations between SAP and commodity trading platforms and the mantra DRY
doesn’t really apply. You start over from scratch each time. Sure, I know more
about the quirks of the various systems, but just because it worked at the last
place doesn’t mean it works now.

All of these systems are moving targets as well at various version/patch
levels so the best way to estimate projects of this nature is to take a
conservative estimate and double it. Then add 50%.

------
lordnacho
> When TSB split from Lloyds Banking Group (LBG), a move forced by the EU as a
> condition of its taxpayer bailout in 2008, a clone of the original group’s
> computer system was created and rented to TSB for £100m a year.

> That banking system was a “bodge of many old systems for TSB, BOS, Halifax,
> Cheltenham and Gloucester and others” that had resulted from the “nightmare”
> integration of HBOS with Lloyds as a result of the banking crisis, according
> to one insider who had extensive access to and intimate knowledge of LBG and
> TSB’s internal systems over a prolonged period.

That sounds completely crazy. If you've got £100m a year in IT budget, why on
earth would you buy a clone of Frankenstein?

You could hire a fine team of devs to build you a modern system. Then again,
I'm not the kind of guy who believes in "never rewrite" which seems to be the
advice.

> On Thursday he admitted the bank was on its knees, announced that he was
> personally seizing control of the attempts to fix the problem from his
> Spanish masters, and had hired a team from IBM to do the job.

This doesn't give me much confidence, either. Hiring outside help is a Coase
problem. You're going to find frictions dealing with the externals. And it
will cost you, I'm guessing, at least £100M a year.

With that kind of budget, and with IT being more or less all a retail bank
does, you should hire hundreds of experienced staff, make them integral to the
business, and let them solve the issues as they appear to the business units.
When things happen they will have an idea of what the priorities are. There
are plenty of software people who understand how banking works, and what
systems are needed. Go and hire them.

~~~
Stranger43
And shut down your entire operation while said team worked?

The problem is that the average bank's system is a kind of Frankenstein tree
that has grown inside and around every policy, procedure and task the bank
performs, without any coherent design and with several dozen loosely coupled
components, each with its own poor and fragmented documentation.

And while it's typically only required to remain up 16-18 hours a day, you have
zero allowance for unplanned downtime and a fairly high peak load, which
along with the age and complexity of some of the components makes the entire
system a nightmare to run on a tight budget.

And it's worth noting that the system that failed was not the Frankenstein
system they inherited from Lloyds but the new one they tried to import from
Spain.

~~~
ethbro
This. Banks can't have extended downtime.

So if a $100M budget would get a new system built, then what they would need
is a $200M budget ($100M to run legacy in prod + $100M to build new system and
gradually migrate).

The only good in-house systems I've ever seen were (a) based on vendor
reference designs w/ minimal changes, (b) based on OSS, (c) architected by
some very smart people who stay at the company for non-monetary reasons.

Because when you boil it down, no company is big enough to solve a
problem better than a group working with multiple companies (unless the
problem is trivial).

Or to word it another way, are you bigger / better / smarter than both of your
top competitors put together? If no, then don't reinvent the wheel.

~~~
Khaine
Banks can't have any downtime. They need to be able to process EFTPOS and
Credit Card Transactions 24x7. Now, not all systems need to be always
available in the bank, but the key ones do.

------
marvin
Banking is particular in the sense that there is often a large number of
legacy systems that need to communicate reliably, at the same time that major
business/technology decisions are made by leaders who do not have technical
understanding. (Even in my native Norway, lauded for good digital banking
services, almost no banks have a single technologist in their executive team
-- the culture is changing, but most places still consider technology a
service that is purchased and mostly separate from business concerns).

In a good case, top leadership will listen to architects and leads on the tech
teams before making critical decisions, but in some cases, "business goals"
will trump concerns from the technologists. This is of course a huge failure
of communication, but it is an even bigger failure of organization. If
technologists have to threaten to quit just to get their point across, the
organization is broken.

Let's say I'm a senior banking developer in Europe. I'm being paid a fixed,
moderate salary with a fixed number of hours each week, and I get no part of
the bonus if this €100 million initiative succeeds, and I suffer no loss if we
incur another €100 million of extra costs due to this failure. If I quit, it's
with 3 months' notice and it's a big PITA to find new work that suits my
interests. What incentive do I have to do anything but do my best to alert the
leadership to these problems, and then do my best to move the train of failure
along?

The exact scenario described in this story -- an expensive service contract
terminated at a hard date, with costs to re-instate this contract therefore
becoming even bigger, and the development team being pressed to deliver on
this deadline whether it is realistic or not -- does not surprise me at all,
and must have happened dozens of times all over the world.

If it goes wrong on a small scale, you will only see a few hundred or
thousands of customers affected in a non-catastrophic manner (e.g. see the
wrong balance in their accounts, but with the correct number being accessible
in the back-end system), but it stands to reason that this example would at
some point happen at a spectacular scale with no easy way back.

I'm not holding my breath, but at some point the boards of these banks should
realize that technology is a core competency, and get people with tech skills
in a position to make critical top-level decisions. (Not to mention get pay to
have some semblance of connection with the sums involved in the success or
failure of the work -- I get the impression that this is the case elsewhere in
finance).

~~~
timthorn
> at some point the boards of these banks should realize that technology is a
> core competency

The regulator already has. Some years ago they fined NatWest after an IT issue
because the management processes and technology risk management were not
robust; I'd expect that they'll take a very hard look at this case.

~~~
walshemj
There is a recent job ad, posted 3 days ago, for a CIO on LinkedIn:
[https://www.linkedin.com/jobs/view/633114556/](https://www.linkedin.com/jobs/view/633114556/)

~~~
Macha
More likely a troll? I can't believe a major bank is hiring a C-level on
LinkedIn

~~~
sulam
I think it’s a miscommunication. That looks like a senior role in their CIO
‘group’, not specifically the CIO.

------
wiredfool
"""“This turned what was a super-hard systems job [into] a clusterfuck in the
making,” the insider said"""

Oh dear.

Later:

"""The bank has been forced to cancel all overdraft fees for April and raise
the interest rate it pays on its classic current account in a bid to stop
disillusioned customers taking their business elsewhere."""

I think the main reason their customers aren't taking their business elsewhere
is that their money is stuck till this is resolved.

~~~
tialaramex
The UK's banks are all obliged to offer a system to consumers (and small
businesses now too) where - manually if necessary - the bank transfers that
customer's current account to another bank in a specific time period (one
week? 10 working days? I don't remember). This is a result of government
investigators concluding that banks weren't actually facing much competition
because their customers thought switching to a competitor would be really hard
and so they didn't bother.

Obviously for this to be economical normally, the banks must automate the shit
out of the problem. That means not just transferring the correct balance but
identifying regular payments, notifying payees like employers, and sorting
basically everything out. Anything they don't automate ends up as yet more
work for their customer services agents, because if it goes wrong they have to
pay to fix it. So for TSB right now this is yet another cost they're soaking,
and they don't even get to keep the customers, those customers are gone, no
take-backs.

------
the_mitsuhiko
Is there anyone with a TSB account that is not encountering issues at the
moment? From what one can read online it sounds proper dire for the bank.

~~~
merlish
I managed to log in, but the system is pretty barebones.

Trying to change or apply for a new banking product just takes you to a help
page saying the ability to do this is 'Coming soon'. (Some other features are
scheduled to be available by the 'End of April', for comparison.)

Also, in the 'pending transactions' popdown, e.g. £38.60 is displayed as
'38.6'...

~~~
peoplewindow
That implies the balances are being represented as floats and turned directly
into strings ... how does something that basic happen at all? Is Sabadell's
Spanish web UI like that too? No wonder they're screwed

~~~
jacques_chester
> _That implies the balances are being represented as floats ... how does
> something that basic happen at all?_

JavaScript's number type is always a float (an IEEE 754 double). For the
unwary this is a common source of bugs in frontends to financial systems.

Hopefully the backend is using some kind of decimal type.
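
To illustrate the '38.6' display issue, here is a minimal sketch. It is not TSB's or Sabadell's actual code; `formatPence` is a hypothetical helper, and keeping amounts as integer pence is an assumption about how a frontend might sidestep the problem.

```typescript
// A float turned directly into a string drops the trailing zero.
const naive: number = 38.6;
const naiveText: string = String(naive);

// toFixed(2) pads the display to two decimal places, but the value is still a float.
const fixedText: string = naive.toFixed(2);

// Float arithmetic can drift, which is why display fixes alone are not enough.
const driftedSum: number = 0.1 + 0.2;

// Hypothetical helper: amounts are carried as a whole number of pence
// and only converted to pounds when rendered.
function formatPence(pence: number): string {
  if (Math.floor(pence) !== pence) {
    throw new RangeError("amount must be a whole number of pence");
  }
  const sign = pence < 0 ? "-" : "";
  const abs = Math.abs(pence);
  const pounds = Math.floor(abs / 100);
  const rem = abs % 100;
  // Left-pad the pence part so 5 renders as "05".
  const remText = (rem < 10 ? "0" : "") + rem;
  return sign + "£" + pounds + "." + remText;
}

console.log(naiveText);          // the bug: no trailing zero
console.log(fixedText);          // padded for display only
console.log(formatPence(3860));  // integer pence rendered as pounds
console.log(driftedSum === 0.3); // false: the sum is not exactly 0.3
```

Integer pence stay exact for any realistic balance, since doubles represent integers exactly up to 2^53.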

------
collyw
Any insiders at other banks want to comment on the state of their systems? I
would like to know if my banks are at risk.

------
merinowool
I wonder why exactly this has failed. It feels like, with good practices
(especially TDD), this shouldn't have happened. Also, wanting to do a "big bang
release" is a recipe for failure.

