HSBC moves from 65 relational databases into one global MongoDB database (diginomica.com)
251 points by truth_seeker on June 13, 2020 | 274 comments


This title is misleading, and the article is missing a key piece of information.

This is ONE system at HSBC - out of literally HUNDREDS (maybe thousands) of applications. It's not as if HSBC is moving ALL of its applications to MongoDB. HSBC doesn't have a single tech stack. They have thousands of IT employees all using different tech, in different parts of the world, for different departments. This might be as inconsequential as the system that catalogues security camera feed URLs - or maybe the one that monitors remote employee company mobile data usage - who the F knows, because the article gives us zero information.

Account Management, Credit Cards, Mortgages, Security, Asset Management, HR, Legal, Compliance, Regulatory, Risk Management, Trading, Operations, Building management, Payroll, ATM comms, Clearance/settlement, Website, a million different reporting engines, etc, etc, etc... and then each region usually has its own application for each function - literally hundreds of tech stacks in every tech you can imagine from DB2 Mainframe to Oracle to MongoDB. All banks are like this.

This article is just vague PR and is just referring to one single group consolidating their regional instances. It does not deserve HN attention.


The article is very shallow on technical details indeed. It seems they are trying to promote MongoDB as the solution to their failed attempt at standardizing their application/data model:

> Historically, (...) HSBC did have an application core program environment, which had most of an application's core functionality. But it couldn't have a single programme environment running for all the countries, due to the differences in data models and databases.

> Another benefit is to use the same database for global data analytics and reporting. We don't need to translate into another data model or another database to run the analytics and reporting from that particular data.

So this "application core program" failed to provide an unified solution to their data warehousing needs, and now they believe they can solve this problem by unifying 65 databases into one MongoDB instance, "taking advantage" of its schema-less model.

It looks more like a global storage service than anything else to me.


So, problem: schema does not accommodate enough use cases.

Solution 1: refactor schema

Solution 2: abandon schema

The truth is that you can’t really abandon schema, it’s just spread through the code base now, and any violation of the schema will blow up not when you write malformed data but when you read it later.
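A toy sketch of what I mean (plain Python, made-up field names) - the malformed write succeeds silently, and only the read blows up later:

    # Nothing enforces the shape at write time; the "schema" lives in reader code.
    orders = []

    def save_order(doc):
        orders.append(doc)            # accepts anything

    def total_in_cents(doc):
        return doc["amount_cents"]    # readers now carry the schema

    save_order({"id": 1, "amount_cents": 1250})
    save_order({"id": 2, "amount": "12.50"})   # wrong key, wrong type - accepted anyway

    print(total_in_cents(orders[0]))           # 1250
    try:
        print(total_in_cents(orders[1]))
    except KeyError as err:
        print("blew up at read time:", err)    # long after the bad write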


Truth.

Anyone know any good tools to manage the evolution in schemas to address this problem?


At the lower level I’ve found ActiveRecord migrations from Rails inspirational. I'm not using Ruby, but the idea is solid and I have built a similar system for C# and SQL Server.

You still need something higher level to oversee the schema, but I didn’t look closely - my schema is small enough I can keep track of it in my head.
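The core of the idea is tiny. A rough sketch in Python against SQLite (my real implementation is C#/SQL Server, and the migrations here are made up): each migration gets an ordered version number, and a table records which ones have already been applied.

    import sqlite3

    MIGRATIONS = [
        (1, "CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT)"),
        (2, "ALTER TABLE accounts ADD COLUMN country TEXT"),
    ]

    def migrate(conn):
        conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER)")
        current = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()[0] or 0
        for version, ddl in MIGRATIONS:
            if version > current:            # apply only what hasn't run yet
                conn.execute(ddl)
                conn.execute("INSERT INTO schema_version VALUES (?)", (version,))
        conn.commit()

    migrate(sqlite3.connect(":memory:"))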


Flyway is a good tool if you don't want to be tied to a specific language.

The community version (open-source/free) does have a few important limitations that may or may not matter to you.


+1 for Flyway. Used it at two companies. The ability to rebuild the schema from any version to the most recent version is amazing. Also the ability to review a schema's structure at a given git commit is super useful.


I am sure it is quite a complex decision. If I had to guess they have a bunch of document storage all over the place and they are unifying it.

As others have said, you can either validate the schema at WRITE time (RDBMS) or at READ time (NoSQL). Personally I hate READ-time validation because then you have all sorts of data integrity fun to work through... In either case, I am sure they have a ton of reference documents that they need to store, so it's easier to manage 1 Mongo than 64 different weird databases.


And if you use Owl you get both?


I initially rebuked thinking that the OP sounds like a disaster, but then I remember from working with other banks that they have an infinite amount of systems from decades of mergers, re-org, etc, all siloed in so many different ways.

So in no way was the OP's article as catastrophic as it initially sounded.


Your point is clear enough, but FYI you're misusing the word "rebuked". ("recoiled"? "reacted"?)


I think I merged two sentences and feelings into one. :)


It’s very common for tech companies to try and paint a pretty growth picture when it comes to banks but you’re right. There are just so many technology stacks and so many databases across the entire organization that this feels light in terms of effort in financial services.


Jepsen's latest analysis of MongoDB -- https://jepsen.io/analyses/mongodb-4.2.6 -- finds that it doesn't even preserve snapshot isolation when set at the highest consistency level. This seems like a pretty terrible decision.

I've never wanted to short a company more in my life.


Data loss is a good thing at HSBC, banker of the criminals and terrorists™️[1], for when the feds come knocking

[1] https://www.theguardian.com/business/2012/dec/11/hsbc-bank-u...


2012... they are very strict now. Not saying it cannot happen, but at least for normal people it's hard. They check far more than any other bank I'm with, and compared to a challenger bank it's almost a black comedy how much you need to fill out, prove, sign etc to open an account. At least in HK, my hometown. Disclaimer: I have an account there.


Or it is all theatre.


Please stop with the FUD. If you have a substantiated claim, please make it. Otherwise you’re not adding anything to the discussion.


They've been caught laundering money many times over the years, what makes you think they finally changed?


Like I said, could be, but if I wanted to do something not entirely on the up-and-up, I would just open a challenger/neo-bank account in 5 minutes from my home and do it in there. Not somewhere where they have proof of everything I have and do which I need to refresh every year.


That’s a stretch. Carefully-controlled deletion with plausible deniability is a good thing.

Uncontrolled data loss is most assuredly not.


MongoDB: The best DLAAS solution money can launder. (Data Loss as a Service™)


Is HSBC still the bank favored by the CIA for its money laundering needs? I had read that for an FBI agent to look into HSBC was a "career-limiting move", but that was 10 years ago. Maybe CIA have moved to another by now?


> I had read that for an FBI agent to look into HSBC was a "career-limiting move"

Do you have any source that substantiates your assertion?


I was daydreaming earlier on how these mega-big banks can get away with everything (HSBC, Barclays, DB, MS, BAML, etc. etc.) and how nice it would be if all these super complex scheming products that are leverage upon leverage upon leverage would disappear tomorrow...

These mega-big banks' practices range from LIBOR manipulation to scammy retail banking to scamming their own IB/FX customers. And they get away with it.

As much as I dislike religion (I love the notion of Faith, but I dislike the self-identified 'representatives of gods'), I believe that the most ethical type of Banking is the Islamic Banking (again, not about the religion)(gods that give cancer to little kids to "test their faith" are not my type of gods).


I got a credit card with them a few years back and immediately regretted it. Their tech (website, security, etc) seems okay at first glance, but almost immediately it turned into agony. Even just logging in is a chore, and often doesn't work. Their two-factor implementation is a mess.

In short order, I cut the card up. Still get sad emails (which they provide no way to stop) begging me to use it.

Also, interviewed at MongoDB once. Couldn't shake the feeling of a cult, where everyone was careful not to speak of the product's serious limitations.


Haha I feel this. Did you also have the credit card sized token generator with a keypad? How convoluted was that login process? Agony is the correct term!!


I had forgotten about this but the dumb 2FA generator is the reason I switched away from HSBC for Citibank (UK) many years ago!


> Even just logging in is a chore, and often doesn't work. Their two-factor implementation is a mess.

My parents switched from Lloyds (UK) to HSBC and are looking to switch again.

You can't have multiple devices logged in. I.e. you can log in just fine on your "security key" device (say a phone), but if you want to login on your iPad to check something, you have to get a code from the app on your phone. Absolutely pointless.


I also have an HSBC credit card. I am unable to pay my card balance on the app, I have to log in to the website and get redirected to a separate web app to make payments on it. I’m keeping the account open and paying a single monthly recurring bill so I only have to do this once a month. All my other credit card balances can be paid down through their corresponding app.


Do they offer auto pay?


I dunno. I am banking with them (also other banks) and they are pretty good - most of the other banks face similar problems around the same areas, which I ascribe to very strict regulation.


Sounds like a marriage made in heaven between these two.


Cutting up the credit card is good, but you may want to officially close the account.


Considered it, though by the strange rules of the US credit system, it would harm my credit score to do so.


Shorting a banking company because they chose to use MongoDB... you can tell that an engineer wrote this comment. You're missing the forest for one particular tree.


I thought the parent was talking about shorting MongoDB ("MDB" NASDAQ ticker)


I agree, though decisions made in isolation can give some peeks into company culture. After all, forests often have many of the same kinds of trees.


And some forests have a few carefully arranged trees of a different type.

https://en.wikipedia.org/wiki/Forest_swastika


Eh, it sounds snarky, but it's not a bad idea because downtime seems inevitable when using MongoDB.

I'm not informed enough to know if it'll affect the share price for a bank but if a tech company did onboard mongo that'd be a major red flag for me.


Why expect downtime for a DB that has built-in replication?

I would expect downtime from RDBMSes, and most banks tell me when I log in that they have scheduled downtime nearly every weekend.


Yep. Also, always remember that markets can stay irrational longer than we can stay solvent, or something to that effect.


Transactions in bank accounts are actually not realized through DB transactions. It's easily visible if you ever had a failing payment: the amount gets subtracted but added again right away after that. They have pretty strong auditing requirements.


What has "amount gets subtracted but added again right away after that" have to do that with DB transactions?

Basically each Banking transaction is an immutable log, which can easily reside in a DB with DB transactions feature. They are not related.

Can you please elaborate in technical terms what you mean?


Not the OP, but I think it's essentially an event-driven system as opposed to a real-time one.

Think of an account with $0 in it.

1. A -$25 event comes in

2. A +$50 event comes in

3. A -$25 event comes in

The first event would fail in a transactional/real-time system, but at my bank it doesn't. This tells me the balance is only calculated every once in a while. The balance you get in the app is only an estimate.

This makes sense when you think about how slow bank transfers and paper checks are. At some point it is decided to run the events "for real" and that's where overdrafts and the like are applied.

Don't think of them as transactions, instead it's a transaction request that can be rejected.
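Rough sketch of the idea (Python, numbers made up, not how any real bank implements it): events are just accepted and logged as they arrive, and the balance plus any overdraft fees are only worked out later in the batch run.

    events = [-25, +50, -25]            # accepted in arrival order, nothing rejected

    def settle(opening_balance, events, overdraft_fee=35):
        balance, fees = opening_balance, 0
        for amount in events:
            balance += amount
            if balance < 0:             # fees only assessed when the batch runs "for real"
                fees += overdraft_fee
        return balance - fees

    print(settle(0, events))            # what the end-of-day run would produce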


Related to this — Wells Fargo was sued[1] (and lost) around 2010 for the practice of ordering daily transactions to (indirectly) maximize fees and penalties.

In your example, a “good” bank would order the transactions: +$50, -$25, -$25 resulting in a zero balance with no penalties. What Wells Fargo did could result in the transactions being ordered as: -$25 (overdraft), -$25 (overdraft), +$50.

Each overdraft could charge a fee up to $35, leaving the account holder with a -$70 balance.

[1] https://www.latimes.com/nation/la-fi-court-bank-overdraft-fe...


Seems much simpler only to charge fees at the end of the day...


Unfortunately, the optimization goal for commercial entities generally is maximum money extraction, not simplicity :)


It depends on the country and currency, but there is the SWIFT system, which works like messaging. There are also the SIC and EuroSIC. In the SWIFT system, as a bank you have an account at the other bank and send an instruction to decrease your amount and book it to a receiver. If there is more than one bank between them it gets routed, but always as "own decrease and other bank increase". Only the receiver bank books to the client. This system works because every bank has accounts at other banks and you can build up paths.


Sure, it just means that the processing is done in a batch job at the end of the day. However, it doesn't mean that the batch jobs don't use a database. There's probably something like db2 on the mainframe, possibly even COBOL or RPG


Yep, it's pretty clear they use IBM db2 on z/OS internally for some applications. Here's an example job posting: https://webcache.googleusercontent.com/search?q=cache:Aqb9oU...


How slow bank transfers are? Is 2 seconds really that slow?


In the US they clear overnight, as I understand (from a previous hn post I can no longer find) it's a big sftp server and flat files.


Working in the fintech has opened my eyes to how many shortcomings there really are in banking/ACH. What you describe above is both my nightmare, and my reality. Pipe-delimited flat files being SFTPed via cronjobs overnight. Getting the emails at 5am that balances are off/not reconciled because somehow an errant '|' got into the mix.

Additionally, the number of folks that ask for spreadsheets (not encrypted) with your SSN... and then proceed to email that data around (sometimes via Gmail!)


The article is about HSBC, a British bank, where the transfers happen in a matter of seconds. What is the point of comparing it with US banks?


I could be wrong but I think the following scenario is worth considering. Let's say a client wants to transact money from bank A to bank B. The following steps are taken:

1. Client initiates the transfer request.

2. Bank A decreases the client's balance.

3. Bank A sends the request to transfer money to bank B.

Now, I guess you would expect that steps 2 and 3 should form part of a DB transaction. This however does not really work because it is unclear what conclusion we should draw from step 3 failing. On one hand, it is possible that the request failed on its way to bank B, in which case step 2 should not be applied. On the other hand, it is possible that the request successfully reached bank B but the response got lost on its way back. In this case step 2 should be applied.

Therefore, not treating steps 2 and 3 as a single transaction makes sense if you want to take a conservative approach that does not allow for accidental transfer of infinite amounts of money to bank B in scenarios when the response fails to come back from an otherwise successful transfer.


We have exactly the same problem at work. One possible way to solve it is by ensuring that the receiving system treats the request as idempotent. One could also poll first at the receiver whether the transaction exists, and then send it if it doesn't. But both measures require that there is a way to uniquely identify a request. Also, idempotency can be annoying to implement correctly. Thus, implementors might try to wiggle their way out of it, technically or politically.
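A minimal sketch of the idempotent-receiver idea (Python, in-memory for illustration; a real system would persist the request ids): the sender attaches a unique request id, and the receiver remembers which ids it has already applied, so a retry after a lost response cannot book the transfer twice.

    applied = {}    # request_id -> result

    def receive_transfer(request_id, account, amount, balances):
        if request_id in applied:                  # duplicate retry: don't book again
            return applied[request_id]
        balances[account] = balances.get(account, 0) + amount
        applied[request_id] = "ok"
        return "ok"

    balances = {}
    receive_transfer("req-42", "acct-1", 100, balances)
    receive_transfer("req-42", "acct-1", 100, balances)   # retried after a timeout
    print(balances["acct-1"])                              # 100, not 200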


This is the only comment worth a thing here. Unless mongodb causes irrevocable data loss, eventual consistency is easily mitigated using an accounting service. I have seen one of them at work. Transaction amount comes in, it is compared with initial value and the session is kept open until the quantity is fulfilled. If there is a timeout, secondary systems try to recover or alert. Recovery can even deduct overfills or complete fills.


This seems ridiculously complicated and unnecessary for something that should be as easy as having a proper RDBMS that is ACID-compliant. I guess now we have one additional reason why banking has stagnated for so long.


Just in case you (or someone else) don't know why you are being downvoted, I'll give you a few reasons:

- Banks have to put these double-checking and reporting systems in place because the law mandates them to. No amount of ACID compliance in your database is freeing you from that.

- These requirements are not stupid either: assuming you had a perfect system with perfect ACID compliance and 0 bugs ever (which we all know is impossible), a random hardware failure (or cosmic ray) can flip a bit somewhere and you are screwed.

- Your comment reads as extremely condescending, which is never good to get your point across.

- Not only that, but you dismissed the work of some of the best people money can buy in two sentences with 0 rationale. Next time you think you know better than an entire extremely well funded industry, think again, and again, and again. If you are still convinced you know better, please open shop because if you are right you'll not just make yourself the next self-made billionaire, but also probably improve our lives in the process.


This comment gives me hope that there are at least some decent people on Hacker News.

Every time I read an article here I am put off from participating further because the comments are usually filled with toxic spew from peacock fanboys who think they know better.

You have my up vote.


It is complicated, but necessary. You will get to the ridiculous part once you realize that executing brokers don't care about your ACID guarantees or whatever.


Thanks for the info - I didn't realize we were discussing distributed transactions with external brokers/banks.


Banking is integrations with literally hundreds of different external systems.


Are we discussing HSBC bankers or Silk Road LSD dealers?


I suspect ACID doesn't help either of them.


Seems like the guys in HSBC's tech department are still doing the whole fashion-over-merit thing.


What's wild about this is that Mongo isn't even in fashion anymore! NoSQL has been supplanted by NewSQL, and relational dbs are in season again. Even if they were trying to follow the trend, they'd be a half-decade behind.


I work at another big corp in London and mongo has been doing a major sales push on us in the last 18 months (which wasn't pleasant to push back on).

I'm fairly sure that this was driven by a similar sales push.

Mongo has always been a sales and marketing company first and technology company second.

I'd love to see a teardown of their sales and marketing strategy because it's clearly top notch.


The strategy is called “Lie, Overwhelm, and Raise Rates on Renewal”


I've never heard the term "NewSQL" before.

Google Trends comparing "NewSQL" with "NoSQL" https://trends.google.com/trends/explore?date=all&geo=US&q=N... (Peaking with its introduction in 2011).

It doesn't seem like a more fashionable term than NoSQL.


Interesting, what is considered newsql? Postgres?


I am not a fan of this buzzword (and indeed buzzwords in general, the comment above was intending to make fun of HSBC IT for even failing at the buzzword game), but the notion of NewSQL was roughly to keep the relational model and consistency guarantees of traditional SQL databases, but also have horizontal scalability in line with NoSQL databases. DBs like VoltDB and Spanner come to mind as examples -- here are two relevant blog posts from those teams:

VoltDB: https://www.voltdb.com/blog/2016/09/nosql-vs-newsql-whats-di...

Spanner: https://cloud.google.com/blog/products/gcp/from-nosql-to-new...


No."Horizontally-scalable" SQL like CockroachDB, TiDB, Google's Spanner or YugabyteDB are examples of NewSQL, Postgres would be "OldSQL" if you will.


PostgreSQL still the best sql though :-)


SQLite is the hottest new thing on the block.


Huh? SQLite is nearly 20 years old, and is only intended for small scale use.


It was intended as a joke in context, but I believe SQLite is far more capable than most will give it credit for. This is especially true for applications which you only ever intend to run on a single logical machine. But, even for those applications requiring multiple nodes, it can still be a powerful tool if you assume clustering & replication duty in your application logic.

For any use case, SQLite is one of the best ways to persist structured data to local disk.


I try to use SQLite for all projects until I can't.

It's a little scary how much it can actually handle, and how lazy and suboptimal it is to reach for a "real" SQL server before you need it.

Having the database as a simple file right next to your application is incredibly convenient, not to mention how brain-dead simple things like backups are to grok.


SQLite is incredible, especially for embedded applications. It's one of those perfectly-scoped tiny pieces of software that consistently makes my heart sing.


Oh, I agree about its utility, and use it myself. It seems like a different category of tool than the other DBs under discussion, though.


Tidb, cockroachdb maybe?


While that is true, shoving that logic into the database itself tends to indicate an anti-pattern in the software that uses the database.

It used to be lots of fun and highly optimal to write all sorts of procedures and triggers because it was the only fast and reliable way to do things, especially when your in-database code was essentially the only gatekeeper for a multiple-owner database schema.

The idea that you have one database schema that you are going to share with different applications seems bonkers to me at this point in time. Compute and storage resources are far cheaper than development, DBA and potential stagnation that a schema that doesn't have a single owner brings.


From the article, it looks like they are replacing separate DBs used for each country.

"Local requirements for each country will be built into the application, but there's no need to maintain separate data models or separate databases anymore. We could easily design the global data model and database using the MongoDB JSON schema model. That brings data from all operating countries into one database and the application can run on just one database. Which is a lot of reduction in resource and maintenance cost. "

Are there any other schema-less databases out there that are better than MongoDB? If not, I believe it's time for someone to build one.


> Are there any other schema-less databases out there that are better than MongoDB? If not, I believe it's time for someone to build one.

Yes, PostgreSQL supports JSON and won't lose your data: https://www.postgresql.org/docs/current/datatype-json.html

Benchmarks:

* https://portavita.github.io/2018-10-31-blog_A_JSON_use_case_...

* https://www.postgresql.eu/events/fosdem2018/sessions/session...
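To be concrete, storing and querying JSON documents in Postgres looks something like this (psycopg2; the connection string, table and document fields are placeholders) - "schemaless" documents with ordinary SQL and transactions around them:

    import psycopg2
    from psycopg2.extras import Json

    conn = psycopg2.connect("dbname=test user=test")   # placeholder connection string
    cur = conn.cursor()
    cur.execute("CREATE TABLE IF NOT EXISTS docs (id serial PRIMARY KEY, body jsonb)")
    cur.execute("INSERT INTO docs (body) VALUES (%s)",
                [Json({"country": "HK", "type": "account"})])

    # jsonb containment query; a GIN index on body makes this fast
    cur.execute("SELECT body FROM docs WHERE body @> %s", [Json({"country": "HK"})])
    print(cur.fetchall())
    conn.commit()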


From the second link, summary:

Summary - PostgreSQL

● PostgreSQL has poor performance out of the box
    ○ Requires a decent amount of tuning to get good performance out of it
● Does not scale well with large number of connections
    ○ pgBouncer is a must
● Combines ACID compliance with schemaless JSON
● Queries not really intuitive

Summary - MongoDB

● MongoDB has decent performance out of the box
● Unstable throughput and latency
● Scale well with large number of connections
● Strong horizontal scalability
● Throughput bug is annoying
● MongoDB rolling upgrades are ridiculously easy
● Developer friendly - easy to use!


> PostgreSQL has poor performance out of the box

This is actually a valid point. They should start distributing typical configs for typical AWS-like machines; the current default config is for some underpowered machine from the 90s.


> Queries not really intuitive

ISHYGDDT


I would argue that MongoDB queries aren't intuitive but I've known SQL for at least 10 years


For MongoDB in Python I once had to construct a list of two dictionaries, themselves containing dictionaries, with various two- and three-character magical key and value names (e.g. $gt) in order to query a collection by date. If that's intuitive I'd hate to use something unintuitive.

Oh, and MongoDB Compass is a dumpster fire. Query takes too long? Too bad, it will time out with no option to let it complete. Also I get to write my query in JSON in Compass, then I have to convert that to native Python objects if I want to use it from there. With SQL I copy my query from DataGrip, inject my parameters and call it a day.

I forgot the best bit: if your query has a subtle mistake, most often you simply get no records back, where a similar mistake in SQL throws a helpful exception.
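For reference, that kind of query ends up looking something like this (pymongo; the database, collection and field names are made up) - the nested dicts of "$gte"/"$lt" operators are what I mean by magical key names:

    from datetime import datetime, timedelta
    from pymongo import MongoClient

    coll = MongoClient()["example_db"]["events"]
    start = datetime(2020, 6, 1)
    end = start + timedelta(days=30)

    # date-range query expressed as nested dicts of operator keys
    for doc in coll.find({"created_at": {"$gte": start, "$lt": end}}):
        print(doc["_id"])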


Really? Best of luck with that. Single node, and no failures - and it still messed its pants: http://jepsen.io/analyses/postgresql-12.3


MSFT is trying with Cosmos.


And the question you should ask is, does HSBC actually need to preserve snapshot isolation when set at the highest consistency level? If the answer is no, then your comment is irrelevant.


Based on the headline I thought the same. And just wait until the shady audit and renewal practices begin. In reality the total cost may be small for a bank that big. But messing up even one trade or large account can cost them billions.


Or you could just use Postgres to corrupt your data. Latest Jepsen analysis: http://jepsen.io/analyses/postgresql-12.3


Have you ever heard of Muddy Waters? The famous stock shorter?


There are Jepsen reports and people who know how to read them, right? Fortnite runs on MongoDB; they have 130M players worldwide and do $B a year without issues using it, so do you think it was a mistake for them?


“X uses Y therefore Y is good” in general is a terrible argument. But also, Fortnite is a game. It’s for fun. 99% of their read/write throughput is in the context of a match, and matches are ephemeral.


> “X uses Y therefore Y is good” in general is a terrible argument

Which is not his argument. His argument is "X uses Y, and X doesn't have problems with Y, so Y is good enough for X at least". Which is a completely valid thing to say.


This is like one step above csv files on an NFS share.

Could be a great way to wash some crimes off the books, "oh our DB failed and our backups were stored in the database as well, oopsie".

In less than 5 years there will be an EU directive on how financial institutions need to have immutable db architectures with full provenance.


In other words, it will be legally required for all banks to use Datomic.


What and make them accountable? Not a chance.


It's the EU. That suggestion totally sounds like something they'd do.


> In less than 5 years there will be an EU directive on how financial institutions need to have immutable db architectures with full provenance.

I'm a former Enterprise Architect from the banking sector in Europe.

To be clear, this will never happen.

There has never been any directive that has had a "concrete" impact on the IT architecture of financial institutions.

Yet there have been dozens of regulations that urged banks to "simplify" their IT systems.

90% of IT staff are baby boomers with no background in systems design or architecture, so when a new regulation comes in it follows this scenario 100% of the time:

- Find a vendor that sells software that promises compliance with the new regulation

- Find an integrator that promises integration within the deadline

- Integrate the vendor's software into the bank's legacy stack of 3000+ monolithic apps

- Send a report to the regulator saying they have "redesigned" their architecture and made "investments" in order to take that new regulation into account

The best example of this is "PSD2", which has been the biggest fiasco of the industry; as of last year only 18% of banks had complied with the regulation.

France and the UK said they would not "fine" anyone, because banks "aren't ready" and have made "considerable" investments.

Regardless of what HN thinks, you won't solve financial institutions' multi-decade legacy with a single directive that would suddenly force them to use "ImMuTABLe Db ARChITectuRES".

IT directives have never worked and will never work.

The only way to enforce anything would be to remove their IT systems completely and have them use APIs provided by the regulators, and have the regulator become the sole provider of the "financial system".

They won't let that happen.


I work for a major US bank. Regulators frequently (multiple times a year across different areas of the business) have detailed discussions with us about specific technology choices, especially regarding issues that could affect data integrity or disaster recovery.

I have not seen regulators require a specific technology, but I have certainly seen them questioning technology choices. There may be some truth to echopom's claim (made regarding European regulators, not US regulators) that regulators can be bamboozled with meaningless claims to have "redesigned" a system and "invested" in it... I couldn't say, because both of the US banks I have worked for have taken even gentle hints from regulators EXTREMELY seriously and would not have attempted to bamboozle them.


I work for a large financial institution in the US and this pattern (which you've brilliantly described) seems to be slowly shifting. We're buying less shit and building more. Developer experience is still absolutely horrific but there are material efforts underway to improve it that are starting to pay off.

I don't know that we'll ever get to 'move fast and break things', but I think that's OK.


Some UK banks (at least Monzo, Bo, Mettle and Starling) have fairly sound data models for their ledgers and implement stuff like PSD2 fairly effectively. Presumably over time legacy banks will either catch up or incur the regulator's ire


Are you aware of any of these banks having published anything insightful about their data models and relevant tech?

(I'm not trying to claim you don't somehow know this, I'm genuinely interested to read about this stuff!)


I have worked in a German bank and the Bundesbank does regulate very high-level decisions. E.g., systems which do financial crime monitoring need to be compliant with their rules. Access to pre-production should be limited.


>Access to pre-production should be limited.

Banks have extremely strict rules when it comes to system "access".

When it comes to "System Design" they have very little.

These are two separate topics.


> Banks have extremely strict rules when it comes to system "access".

Isn't that because of SOX compliance requirements?

So, the point above is that "a new set of requirements" could be added regarding "data store software integrity" (though probably named better) if it turns out to be needed. :)


> Isn't that because of SOX compliance requirements?

Banks always had very strict rules; SOX just forced them to formalize those rules/processes.

Per se, prior to SOX they would not formally record the review of the logs; now with SOX they will produce a PDF that says "We have reviewed the logs for App X and consider that no suspicious activity has occurred".

Apart from that, things didn't change much.


Surely their IT depts won't be 90% baby boomers forever. At some point even the most rusted-in-place employee has to leave the company, either through retirement or because they die of old age. Slowly but surely even late followers like banks will move into the 21st century. Banks are not very keen on spending money on things that "aren't broken", but they're even less keen on losing customers and paying fines.


I have to say that on the surface, this sounds horrific.

- MongoDB: Yes

- Micro Services: Yep

- Some attempt at a grand unifying model: Of course

That's pretty much a 10/10 shitshow as far as I am concerned. Mix in banking regs & regional differences for afterburner on that money furnace.

But, in reality, I doubt that one of those 65 relational databases involves the core customer/account/transaction processing facilities. These are probably (hopefully) still running on an IBM system with proper ACID & uptime guarantees. Anything a live transaction flow (especially credit/debit processing) is going to touch could never be trusted on something like MongoDB in this current reality.


My fear is that those IBM systems running Db2 or something were counted among the 65. While it seems obvious to us that they're better off without switching those, I know of several companies in financial services that should know better that trust MongoDB for their core transaction flow.


The 65 relational databases are for the same application (or similar) deployed in 65 countries... At least I read it that way.


They didn't switch anything of consequence. A bank the size of HSBC has thousands of application databases. They never migrate their core transaction systems, they just build piles of shit around them ever higher.


I've worked contracts in UK banks for a number of years now. Nothing on the mainframes myself, but I know some of the guys that do work on those and, despite the stream of flavour-of-the-month tech that we talk about constantly in places like HN, the mainframes work well and will not be replaced any time soon.

The dev cycle for releases and whatnot is incredibly long due to testing and, let's be honest, fear in case something breaks, but when it comes to battle-tested technology, mainframes and DB2 are right up there.

I do know of some code that's over 40 years old and still runs at the core of one of the big banks...

MongoDB, despite its capabilities, is... an odd choice imo! Not trying to second-guess but if I was in that meeting I would have certainly raised an eyebrow.

Good luck to them.


Really depends what apps the databases are supporting. Big banks have thousands of internal apps (as opposed to client facing services) to support workflows, business analysis and management reporting. Might be a fine choice for some of that stuff.


The problem with 40-year-old code is that where I work we had to pull engineers out of retirement to make a regulatory change. In 10 years, I am not sure we will have that option.


I’d wager it would cost a whole lot less to train devs up on the existing system than to rewrite. Without understanding the existing code, you’re going to have a tough time replacing it.


I’ve read that in some cases the initial requirements and source code was lost, so making changes to the existing software becomes a reverse engineering problem in addition to a “language proficiency” problem.


This seems like a hiring failure to me, not a tech failure. I've been hired on several occasions to work with a language I had zero prior exposure to, and while there is of course some ramp-up time I managed just fine and eventually became skilled and knowledgable in those languages.

I don't see why COBOL, mainframes, or whatnot would be any different. Perhaps the ramp-up time would be a bit higher, but most skilled programmers will manage. I actually wouldn't really mind working with any of that – I just never had the opportunity.


I work for a large consultancy - we have large teams dedicated to COBOL, and I believe others (e.g. CapGemini, TCS) do too.

I wouldn't claim they are any good, just mentioning that grads are being trained in COBOL.


My prediction is something will go horribly wrong, then they will go "see, we tried modern tech and it fucked us over", and they will commit to seeing their code and tech stack turn a century old before considering any updates.


I read the article and couldn’t figure out what they were actually putting in mongo. If it’s their core processing/ledger then that would be very surprising. But banks have dozens of different systems, some of them would actually almost be suited to Mongo, and some of them have rather loose SLAs. Unless I missed something, the article didn’t really make that clear.


There's another thread on the homepage right now about Crux [0], which is a database designed specifically to be able to operate downstream from multiple legacy systems using bitemporal indexes to stitch everything together. This pattern is very common in banks e.g. for calculating and reporting on risk positions.

I expect you're right that this story is purely about using Mongo for downstream systems.

[0] https://news.ycombinator.com/item?id=23493163


That thought did occur to me as well but, in taking the article at face value it's most certainly some core aspect of their banking systems at the very least...

If they are intent on replacing the mainframe then, imo, MongoDB would be a bad choice but who knows... the new, challenger banks don't use mainframes.


Any ideas what some of the challenger banks are using? I've not seen any examples beyond Nubank (~$10B valuation), who have bet big with Datomic: https://www.youtube.com/watch?v=VYuToviSx5Q

Also: https://www.datomic.com/nubanks-story.html


Not sure exactly but any job ads I've seen for the likes of Monzo and Starling talk about backend engineers that have a deep knowledge of CI/CD and you don't do that on a mainframe.

Also, Monzo are heavy users of Kubernetes, so again, no mainframe. [0][1]

Starling use AWS and GCP [2]

In addition, the cost of entry of a mainframe system is a high bar when you can spin up a server on Azure in seconds and deploy code to it in a handful more seconds and then scale it to the heavens in yet just a few more seconds... all for less than the hourly cost of the IBM salesman.

I've yet to see any jobs that mention mainframes, Cobol or the like.

[0] - https://monzo.com/blog/2016/09/19/building-a-modern-bank-bac...

[1] - https://www.8bitmen.com/an-insight-into-the-backend-infrastr...

[2] - https://blog.container-solutions.com/starling-how-to-build-a...


> Not sure exactly but any job ads I've seen for the likes of Monzo and Starling talk about backend engineers that have a deep knowledge of CI/CD and you don't do that on a mainframe.

The part of the bank that handles core processing is going to be a very small portion of their engineering team. Kube and DevOps ads don’t provide any hints about how much they might rely on DB2, COBOL, etc...


Oracle has a banking platform that they mostly only succeed in selling to smaller banks. I don't know if that would count as "challenger" though.


The article gave me the impression it might have been for some sort of CRM-like system. I don't know if it would be even possible to replace DB2 with Mongo, the consistency guarantees are too weak. Banks are accountable to the central bank in every jurisdiction in which they operate, I wouldn't see them getting that past one single central bank, let alone 60+ of them.


If you remove the specific technical choice from mind (a great engineering team can move mountains with any data layer, something I've also learnt working in numerous contract teams at multi-billion dollar companies) - the simple act of rewriting and consolidating huge amounts of data/infra (and stacks) can yield enormous gains.

I will be tracking their technical reporting more closely over the next year.


"I will be tracking their technical reporting more closely over the next year."

Just curious, but what specifically do you mean? ie. I'd love to keep an eye on this also, but have no idea where to find these types of (public) reports.


> ... the simple act of rewriting and consolidating huge amounts of data/infra (and stacks) can yield enormous gains.

What's a good example of this working out well?


I can also suspect they did it on purpose, to lose some data "by accident". What could possibly go wrong?


I have no actual information about this migration, but I wouldn't be surprised to learn that the databases being consolidated are for some minor role or are separate instances with duplicate data for analytics or whatever. It sounds wildly unlikely that they are going to smush together all their customer accounts in one MongoDB instance. As well as being a huge technical challenge, I'm sure regulators would have something to say about this.


TBH I think so, too. I guess this is just a glorified global network drive to put all the different, country-specific documents like Excel sheets, Word docs, PDFs and some such.

https://youtu.be/GI5kwSap9Ug?t=63


This is very likely. Most banks have stupid-huge quantities of products and services, each with extremely elaborate requirements and data storage needs. “65 relational databases” might well only represent a few percentage points of HSBC’s data footprint.


Sounds like a win-win for them. They get to market themselves as staying up to date with tech (although the HSBC personal banking app I use indicates otherwise). And when they "lose track of" the money they move for the cartel and terrorists, they can blame MongoDB and spare some execs the jail time.


I'm glad someone mentioned cartels and terrorists


I was there in Aug 2019 and I did hear them talking about Mongo. There was a programme at the time to "refactor" their estate; as part of that they wanted to replace a bunch of ad-hoc databases with clunky or no data governance with one (or a few) central places, which would be under Data IT control. The rationale was to reduce the risk of data loss, leaks or general business continuity issues in what was a 6-12 month project.

The cost of architecting a relational model that would cover 65 mini databases in a bank would be astronomical. It is far easier to set up a NoSQL entry point that would "auto-conform" to whatever requirements upstream applications might have.

It is a technical win for Mongo, but I don't believe this is an attempt by HSBC to get up to speed with database trends. HSBC are removing internal audit strikes, nothing more than that.


"auto-conform" is a bit of a rich phrase, up there with "schema-less" and "my data doesn't fit in tables"

Not to sound pedantic or old or curmudgeonly (because while I'm 45, I was already curmudgeonly and reading Fabian Pascal when I was 28). But c'mon.

All data has a schema, it just takes effort to discover it. A relational database is a set of facts about the world. Lack of effort in finding that schema and normalizing data, "conforming" it... it means effort from very frustrated people who have to clean up the mess later and make it conform.

FWIW SQL can seem like a clunky and dated way to talk about data. It is. But it's not the relational model's fault, it's SQL's. Going "NoSQL" simply makes data modeling problems worse. It pushes the problem until later, and pushes reasoning about the data into places where it shouldn't have to be. Even more so when the underlying data storage tech is dubious, like Mongo appears to be.


>The cost of architecting a relational model that would cover 65 mini databases in a bank would be astronomical.

But they already have relational models, right? Can't you port the DDL over to centrally managed instances?


We are working with HSBC and have not heard about Mongo being a thing despite working on data with them. What location and department were you in if you don't mind saying?


This was in London HSBC HQ, Equities


Fascinating. Are they intending to use this for core transactional workloads? How do they intend to deal with Mongo's (many) known consistency issues?


I have not worked on this project, so I don't have details. Mongo is, in a nutshell, replacing a database a trader set up under their desk that grew quite important. There is nothing pretty about that. I'm quite sure they will have issues and workarounds for years to come :(


Incredible -- was this under-the-desk database fairly recent or quite a while ago? I'd love to hear the story if you or someone else familiar with it is willing to share.


There’s been a lot of talk about the Jepsen-Mongo affair on HN lately, but the problem I have with Mongo and NoSQL in general is much more basic - it’s modeling relationships, which always end up appearing in every data model I’ve ever designed.

I’m aware you can keep references from one doc to another but it always ends up being a messy affair even at smallish scales, and it feels as if the cognitive burden of managing these relationships ends up falling on the dev, instead of being managed by the DB.

For those of you using MongoDB in large scale large production apps, is this not a problem? Is your underlying business domain really non-relational, or do you manage to comfortably run highly relational models on a doc-based DB? How?


MongoDB now effectively supports "joins" in its aggregation framework, so it can do some relational-style data representation. The latest versions also support transactions.
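For example, a $lookup stage looks roughly like this (pymongo; the collection and field names are made up):

    from pymongo import MongoClient

    db = MongoClient()["example_db"]

    pipeline = [
        {"$match": {"country": "HK"}},
        {"$lookup": {
            "from": "customers",           # the "right-hand" collection to join against
            "localField": "customer_id",
            "foreignField": "_id",
            "as": "customer",              # matched docs land in this array field
        }},
    ]
    for doc in db["accounts"].aggregate(pipeline):
        print(doc)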


Yeah, but they do not recommend using that. I think the one NoSQL aficionado on my team spoke with Mongo support and they recommended reassessing our document structure for it. (So a migration after all...)

Also in their blogpost [1], about $lookup they say:

We’re still concerned that $lookup can be misused to treat MongoDB like a relational database. But instead of limiting its availability, we’re going to help developers know when its use is appropriate, and when it’s an anti-pattern. In the coming months, we will go beyond the existing documentation to provide clear, strong guidance in this area.

[1] https://www.mongodb.com/blog/post/revisiting-usdlookup


Everything about this sounds mad. I'd have put referential integrity high on a bank's list of wants.

Clearly they have some expertise, I'd love a more in depth look at how they arrived at the decision.


These are definitely not source of truth customer transactions. Likely auxiliary databases to power various experiences. Like, a customer walks into the bank and the agent needs to know what services to upsell. Or the customer logs into the website and is shown their current portfolio value.


That kind of stuff could have been implemented with any large CRM system (Salesforce, Dynamics, SAP) without any big issues.


a) Salesforce is usually out of the question due to it not being on-premise.

b) Dynamics and SAP are both ridiculously expensive especially since you would be need to also buy additional software e.g. Windows to run it on. Plus of course it's much harder to find specialist engineering talent who know these products.

c) All three are far too slow both on the ingestion side e.g. millions of mutations a second and on the read side e.g. feeding into a real-time ML model for an upsell prediction. MongoDB is specifically designed to cater to both of these technical requirements since you can mutate sub-documents and retrieve the full document extremely quickly.


a) I'm not suggesting this for banking transactions - that'd be crazy. However, customer service, interactions whilst at a branch, handling all sorts of loyalty programs, etc. To my knowledge, Barclays in the UK use Salesforce for some of these.

b) I don't know SAP's licensing model, but I can't see a major bank not being able to afford $100/month per user for Dynamics/Salesforce.

c) They do have some offerings for these kinds of things, but I can't comment on how they'd work at such a scale.


How do you know this?


Because no bank in the universe is storing its customer transactions in a bag of 65 relational DBs. It’s all Tandem and IBMz and DB2.


Very rarely do headlines make me physically cringe but here we are.


So you have no idea about the business and non-functional requirements or the architectural or engineering considerations that went into this decision. But yet you believe it's such a bad decision that you physically cringe.

Classic HN.


Did you ever actually read MongoDB's code? I had to, if only the drivers, and when I learned that the C++ driver would just nope out of town in some cases and literally exit() the application, I made the decision to get rid of that stuff ASAP. There are other instances: pymongo cannot be forked, for example. Good luck using it inside Celery. The connection string parsing is another matter. All in all their drivers' code looks good on the surface, everything documented etc. But the comments are more or less superfluous and the design is a mess.


I think there's a bit more to it than that, given Mongo DB's history.


I'm not in the banking industry but this sounds like the type of industry where you really need strong DB schemas in place and not something mongo-like ...


I know things about the banking industry and let me tell you, just downgrade your expectations coming from the tech industry. Then downgrade them again, just to be sure.


And then upgrade your expectations of uptime and error tolerance by a few orders of magnitude.


Banks have notoriously terrible uptime! See, for example, the wide range of multi-hour to multi-week outages suffered by many UK banks in the last few years.

Most of the legacy retail banks regularly take down their online banking for hours at a time to do 'planned maintenance'. I've also heard that this happens to things like payment gateway APIs that aren't directly visible to consumers. Sometimes a Faster Payment might take a suspiciously long time and that's because the API fell over or was down for maintenance.

At least the data integrity is pretty good.


Didn’t realise the UK was so terrible. Banks here in Australia have many flaws but downtime is quite rare.


That hasn't really been the case the past few years, in many places. (Especially the UK.)


MongoDB's newer versions support schema validation by allowing you to register a JSON Schema description against a collection and validate writes against it.
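Something like this, for example (pymongo; the collection name and schema are made up) - writes that fail the validator are rejected by the server by default:

    from pymongo import MongoClient

    db = MongoClient()["example_db"]
    db.create_collection("accounts", validator={
        "$jsonSchema": {
            "bsonType": "object",
            "required": ["customer_id", "balance_cents"],
            "properties": {
                "customer_id": {"bsonType": "string"},
                "balance_cents": {"bsonType": "long"},
            },
        }
    })
    # db["accounts"].insert_one({"customer_id": 123}) would now be rejected:
    # wrong type for customer_id, and balance_cents is missing.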


Fantastic. Knowing that my data was validated before being lost makes it so much better.


But the database doesn't need to be the component enforcing those schemas, right? They could validate the schema of a document on storage and retrieval. And keeping the schema version within a document would allow for multiple schema versions to exist within a single collection, which then would allow lazy migrations.

I'm not saying I'd choose Mongo specifically for this job (if for anything), but the lack of DB enforced schema might not be the biggest problem here.
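A rough sketch of that lazy-migration-on-read idea (Python; the v1-to-v2 migration and field names are made up):

    def upgrade(doc):
        if doc.get("schema_version", 1) < 2:
            # hypothetical v2 change: split a single "name" field in two
            first, _, last = doc.pop("name", "").partition(" ")
            doc["first_name"], doc["last_name"] = first, last
            doc["schema_version"] = 2
        return doc

    def load_customer(coll, customer_id):
        doc = coll.find_one({"_id": customer_id})
        if doc is None:
            return None
        upgraded = upgrade(dict(doc))
        if upgraded != doc:
            coll.replace_one({"_id": customer_id}, upgraded)   # write back lazily
        return upgraded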


This is the problem I’m alluding to in another comment about NoSQL in this thread. Why would you want to enforce the schema in code when the DB could do it for you? It becomes such a mess and such a burden for the developer!


That’s not always true. Often it’s not true. There’s a reason why we don’t have file systems that guarantee all .jpg files are valid JPEG files.


Now I want a filesystem that guarantees all .jpg files are valid JPEGs.


Until you need to do an emergency transfer of some recovered data volume and there’s a few dozen slightly corrupted JPEGs scattered throughout a few million files.

Also, I don’t think I’d want my filesystem deciding what qualifies as a valid JPEG.

But yes, I can see the theoretical appeal.


That means either only one place accesses the database (and becomes a bottleneck for change) or everywhere that accesses it has to keep their schemas in line.

Neither of those sound like great situations to be in.


I’m a big proponent of schema-rich SQL but I don’t see the argument here. Enforced schemas can be a bottleneck for change no matter where it’s enforced.

And just because some parts of a schema need to be “in line” doesn’t mean all of it must be. Maybe I care that “Customer ID” is enforced, but not “Australian Tax File Number” or “Document revision annotations”.


A lot of what banks do is filing forms. Thousands of ever-changing forms from millions of sources. Even things that aren’t forms are still very document-like: transcriptions, attachments, receipts, invoices, statements, letters... all of it more suited to document storage rather than tables.


No, you're clearly not in the banking industry, because your assumption is completely backwards from reality.


After the latest Jepsen analysis of Mongo I'm surprised anyone would use it for anything that serious.


I can't speak for Jepsen. But I have been using it for the last several years and it has never ditched me, even at a transaction data flow rate of 20K per second.

Database alone isn't the deciding factor; how you use it in your application architecture matters the most. No database is perfect, but some are more flexible than others.


My comment was about failed transactions.

Are you using ACID transactions?


Yep. MongoDB 4.2

In fact we replaced Kafka with Change Streams (a little over 7k messages per second) and ES with text search (2K queries per second) in MongoDB, on 16-CPU-core nodes in our cluster.

I don't believe in any benchmark unless it matches the business use case and development practices I use at work. I recommend the same to others.
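For anyone curious, the Change Streams usage is roughly this shape (pymongo, requires a replica set; database and collection names are placeholders):

    from pymongo import MongoClient

    coll = MongoClient()["example_db"]["payments"]

    # watch() yields change events (insert/update/delete/...), which is the
    # Kafka-like feed described above.
    with coll.watch([{"$match": {"operationType": "insert"}}]) as stream:
        for change in stream:
            print(change["fullDocument"])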


HSBC Commercial might have just asked everyone to put their data into a MongoDB data store that nobody ever uses, which would make the headline technically true but kind of irrelevant. One thing you can be sure of is that they didn't stop using 65 relational databases.


whenever I hear of mongodb - I think of that "but it's web scale" video.


Haha yeah me too :)

This is the video if anyone is wondering:

https://www.youtube.com/watch?v=b2F-DItXtZs


Building in plausible deniability so they can resume their HSBC money laundering services with impunity?

Geniuses!

Edit: In case you missed the story https://www.bbc.co.uk/news/business-18880269


"is looking to" makes this sound aspirational. The absence of operational lessons learned that would come out of a deployment make me wonder if this actually happened.


Thank you. The headline reads "HSBC moves..." which in normal news parlance means that they've recently completed the move or the move is in progress.


Sounds like the opening scene of a horror movie.


A fortune 100 financial institution I once worked for had an initiative to migrate their extremely large, fairly well normalized DB2 database to schema-less dynamoDB.

All new tables had to go into dynamo, and if your service begins consuming an old DB2 table, it had the job of migrating that data to dynamo.

The end result of this was applications that half pointed to DB2, half dynamo, confusing new data with old, and tons of bugs relating to losing entries into dynamo tables under outdated keys, not being able to clearly match up customer data, and myriad problems that were never a consideration using a relational database. Needless to say, I pulled all of my assets out of this institution as soon as I had the chance.


I've spent decades in the enterprise space and unfortunately developers often don't have a clue how things actually work.

Companies like IBM specialise in royally screwing companies through their licensing deals and with their legal enforcement teams. They are nothing like dealing with a normal startup vendor or with someone like AWS. They play hard ball.

So sure, you can complain that they should've just stayed with DB2. But usually that means paying tens of millions, which then increases each year since they know you won't migrate.

That's why companies take drastic decisions and move to the cloud even if it's technically not the best option.


But drastic can be going to Postgres/MySQL instead of DB2/Oracle; Dynamo (by the sound of the case here) is more on the insane scale than drastic.


Oh my god I can only imagine how they provisioned their DynamoDB throughput. Setting a dumpster full of cash on fire would have been more expedient.


The dynamoDB bill ran over $25,000 a month, just for their dev environment. Granted, they had thousands of developers, but still.


The article makes this sound way bigger than it is. HSBC has thousands of applications and probably tens of thousands of databases. Migrating one app to use a different DB model means very little for the big picture.


Mongo works well for storing chunks of JSON from transactional browser apps (which I guess is what HSBC is targetting), doesn't require you to have custom endpoints, has a good replication/HA story (on paper at least, if your devs know what they're doing and don't use defaults plus multiple nodes with CORS, etc.), good mindshare among webdevs, and is even the de-facto API for JSON stores with AWS' "DocumentDb" and Azure's Cosmos DB being drop-in replacements. But using it for banking/backoffice apps would be pure madness.


The whole premise of this seems wrong. I've worked on large financial systems that have happily supported different rules for different countries using a relational database. The database was never the issue; it was how to handle testing and feature enablement. This seems like the wrong cure for the problem, and a cure that will end up causing more problems than it solves, i.e. a hack. I'd love to hear more about the rationale. I just don't understand it.


Each and every one of these MongoDB migration rationales contains some variation of "schema design is hard, let's go shopping."

Granted, there are plenty of domains where you can get away with a blurry schema and the occasional bug creating a new localized data anomaly, but banking, where such things erode customer trust? I just don't get it.


Oh my. This seems like a really bad idea. I'm calling it now, this will go wrong.

That "Micro Service Instance" is not a microservice, it's an Enterprise Service Bus (ESB), aka the Egregious Spaghetti Box.

That's aside from the fact that relational data belongs in a relational database.

I really can't imagine it being a single instance, that just does not make any sense.


When do they go live and how can you short them?

PS: I think they've never heard of database-per-service, looking at their microservice graph. Lol


probably safer to buy put options than short the stock (shorting it exposes you to unbounded losses if the stock price rises), but as you say, you'd need to get the timing right. https://finance.yahoo.com/quote/HSBC/options/

If they've got a decent QA process and they can't get the new system to work reliably, there's a fair chance this project gets stuck in QA for a year or two, blocked from going live and eventually cancelled. So it wouldn't cause catastrophic data loss, just reallocate a bunch of money from shareholders to contractors/employees/vendors.

> microservice graph

banks have graphs of macroservices.


HSBC's own website lists dozens of exchange traded warrants with various strike prices and gearing. Need a brokerage with access to the HK market though.

https://www.warrants.hsbc.com.hk/en/tools/search/ucode/00005...

I'm not sure about their choice of Mongo but the initiative seems like it should be helpful for customers like myself who have accounts in a few countries. We can access them all from one app but the experience is inconsistent at the moment. I'm cautiously optimistic.


Seems like a use case for GraphQL, not MongoDB.


Any time I have to use an HSBC online product (they throw in an occasional promotion for their credit cards and savings accounts) it feels like a trip back to the 90s - popups and extra windows on different domain names that are slowly "loading..." information for trivial operations like scheduling a payment on a credit card.


I think this isn't primarily about a database migration/consolidation and specifically not about MongoDB. IMHO it's about cutting cost and complexity by building a single world-wide software platform. Of course this also means that local specialities and local teams will be questioned and probably be made redundant.


I recently came across this article[1] on how The Guardian switched from MongoDB to Postgres.

[1] https://www.theguardian.com/info/2018/nov/30/bye-bye-mongo-h...


Microservices and NoSQL - the wonder-weapons against every IT problem, aren‘t they?!

I really have to pivot my business: the new name will be "The Microservice-NoSQL Consulting Company" and we will sell tailored versions of the "Microservice-NoSQL Strategy"! (Changing the names and colors of boxes for $2,000 per day.)


The whole thing sounds like a terrible idea. Why would you want to put all the data from at least 65 different applications in the same table? In MongoDB? To save cost?

I simply don't get it. It doesn't make sense to me, but perhaps they know something we don't.


360 customer view as an example. If you are feeding a real-time decisioning, support, advertising system then you need all of the customer features accessible as fast as humanly possible. Ideally a consistent O(1) time.

Or it could be a real-time fraud system where you need every bit of customer data in one payload to feed into an ML model.

You can't do any of that if you are having to do multiple joins across federated databases.
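For concreteness, a minimal pymongo sketch of what such a pre-joined read might look like; the collection and field names below are invented for illustration, not taken from the article:

```python
# Sketch of a pre-joined "360 customer view" read with pymongo.
# Collection and field names are invented, not HSBC's.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
customers = client["crm"]["customer_360"]

# One denormalized document per customer: accounts, devices and recent
# activity are embedded, so a single indexed lookup replaces joins across
# federated databases.
doc = customers.find_one({"customer_id": "C123456"})
if doc is not None:
    features = {
        "n_accounts": len(doc.get("accounts", [])),
        "recent_payments": doc.get("recent_payments", []),
        "risk_flags": doc.get("risk_flags", []),
    }
    # `features` is what a real-time decisioning or fraud model would consume.
```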


I would collect data and push events to a new system, not migrate 65 relational databases to a single NoSQL database.


I don't see how one global Mongo is going to help solve any of their issues. Maybe they just want to sweep their issues under the carpet a little longer.

I also hope they've configured their Mongo correctly, since the out-of-the-box config is not appropriate for many use cases.


I want to doubt that a billion-dollar company like HSBC has engineers who will install things out of the box and leave them without proper configuration. But they are proposing to use Mongo barely a month[1] after a report came out highlighting several issues with the product, ESPECIALLY for use cases like a bank's that require a very high level of data consistency, so your point might actually be valid.

I'd really be interested to know how something like Mongo ends up being seriously considered for something like this: the whole decision process that goes into selecting it.

[1]https://news.ycombinator.com/item?id=23285249


I did a contract at HSBC in the past. Some teams are good, but there are people who will install things out of the box and leave them... even in a billion-dollar company like HSBC.


Hacker News has become just "MongoDB bad, SQL good".


This sounds problematic:

1) MongoDB does not offer the guarantees required by a bank.

2) Application developers using MongoDB rarely protect themselves against NoSQL injections. Most of them don't even suspect that such a thing exists.

e.g.: What is your user id? {$ne: null}... whoops, now every user record is returned.

3) They had better use the Decimal128 type for their currency fields, and make sure the value is not cast to a float on the client; e.g. on a JS client, every number is a float.

Unless they've taken the necessary precautions this is a disaster waiting to happen. Hopefully other banks do not do the same.
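A minimal sketch of the precautions in points 2 and 3, using pymongo; the collection and field names here are hypothetical:

```python
# Sketch of the precautions in points 2 and 3; field names are hypothetical.
from decimal import Decimal

from bson.decimal128 import Decimal128
from pymongo import MongoClient

users = MongoClient()["bank"]["users"]

def find_user(user_id):
    # If user_id arrives as parsed JSON like {"$ne": None}, dropping it into
    # the filter turns it into a query operator. Force a scalar type first.
    if not isinstance(user_id, str):
        raise ValueError("user_id must be a string")
    return users.find_one({"user_id": user_id})

# Store currency as Decimal128, never as a binary float.
users.update_one(
    {"user_id": "u1"},
    {"$set": {"balance": Decimal128(Decimal("1234.56"))}},
)
```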


> what is your user id

I don't think you actually tried it. It returns the user whose id is "{$ne: null}" (a string).


It depends. Sometimes you can trick an application into parsing that as an object.
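A quick illustration of the difference, assuming the id arrives via a JSON request body rather than a plain query-string parameter:

```python
import json

# Same payload, two parse paths. As a URL query-string parameter the value
# stays a literal string; parsed from a JSON body it becomes a dict, i.e. a
# live MongoDB operator.
query_param = "{$ne: null}"                    # harmless: matched as a string
body = json.loads('{"user_id": {"$ne": null}}')
print(type(body["user_id"]))                   # <class 'dict'> -> {"$ne": None}
# Passing body["user_id"] unchecked into find_one({...}) is the injection the
# original comment describes; the string version is not.
```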


I would suspect they'd keep the core money transactions in the existing DBs, but use Mongo for less important things like junk mail and customer support sessions.


I have seen many rewrites at financial institutions in my life; one of the biggest I had a good inside look at for the entire project was a Tandem NonStop COBOL rewrite to Java, the famous tens of millions of lines to nice clean Java code. Many of them never work; this one was, if I remember correctly, upwards of 50m euros and was thrown away because it did not work in the end. I think the Tandems are still running.


If they just put commercial-details follow-up data plus some stats in there, it's OK.

That data is updated very infrequently, so integrity is not at much risk. If it's lost, the business is not impacted too hard as long as you can find the contact details elsewhere.

But the replication story is good, and if they have tons of different schemas, it will make their lives easier.

So yes, for transactions it would be bad, but maybe for this particular use case, it's ok.


I can see how they would use it in some areas: analytics, internal reports, etc.

It is interesting that with Mongo you can keep data pinned to a geographical location and still run global queries: https://docs.mongodb.com/manual/tutorial/sharding-segmenting...
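Roughly, that zone-sharding setup looks like the sketch below via the admin commands; the shard names, zone names and namespace are made up, and it assumes sharding is already enabled with a country-prefixed shard key:

```python
# Sketch of zone (geo) sharding via admin commands with pymongo. Shard names,
# zone names and the namespace are made up; it assumes sharding is enabled and
# the collection is sharded on {"country": 1, "customer_id": 1}.
from bson.max_key import MaxKey
from bson.min_key import MinKey
from pymongo import MongoClient

admin = MongoClient("mongodb://mongos:27017").admin

# Pin each shard to a geographic zone...
admin.command("addShardToZone", "shard-eu-01", zone="EU")
admin.command("addShardToZone", "shard-hk-01", zone="APAC")

# ...and route documents to zones by the country prefix of the shard key.
admin.command(
    "updateZoneKeyRange", "bank.accounts",
    min={"country": "DE", "customer_id": MinKey()},
    max={"country": "DE", "customer_id": MaxKey()},
    zone="EU",
)
```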


I think they're just using it for document storage, not customer and financial data. Document storage and retrieval is a big cost for a large institution.


We found Mongo expensive for documents. We had simple documents basically saying X sent email Y to Z. The Mongo DB was cloud-hosted, was over 500 GB, and cost nearly 3k. We moved them into S3 with a simple SQL DB to look up reference IDs. Certainly reduced the cost!
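For anyone curious, a rough sketch of that pattern; the bucket and table names are invented and error handling is omitted:

```python
# Rough sketch of the "documents in S3, reference IDs in SQL" pattern described
# above. Bucket and table names are invented; error handling is omitted.
import json
import sqlite3
import uuid

import boto3  # assumes AWS credentials are configured

s3 = boto3.client("s3")
db = sqlite3.connect("email_refs.db")
db.execute(
    "CREATE TABLE IF NOT EXISTS email_refs "
    "(id TEXT PRIMARY KEY, s3_key TEXT, sender TEXT, recipient TEXT)"
)

def record_email(sender, recipient, payload):
    ref_id = str(uuid.uuid4())
    key = f"emails/{ref_id}.json"
    s3.put_object(Bucket="my-email-archive", Key=key, Body=json.dumps(payload))
    db.execute(
        "INSERT INTO email_refs VALUES (?, ?, ?, ?)",
        (ref_id, key, sender, recipient),
    )
    db.commit()
    return ref_id
```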


To put this into context: I met with a senior tech person there (one layer below the group CIO) a few years back and he said they had something like 40,000 distinct IT systems globally. So replacing 65 of them with one MongoDB instance is something like a sixth of one percent of their systems.


65 databases unified globally, for HSBC's global presence, works out to on the order of one database per country. This is probably a minor trial effort for one of their services, not the core system (which is un-unifiable due to banking privacy laws) or any kind of cross-function unification.


This article is misleading. They have migrated some of their databases to mongo, not all of them


I need to LOL a little bit at all this, the headline, the article.

First off, how would such an article even come about? Feels like more of a recruiting/marketing piece.

The old story at HSBC was that the ATM software hadn't been updated in twenty years, because stability...


If someone here was involved, could you share details on the rationale behind this? I used to like Mongo, but has it changed?

I remember specific flaws/hacks with bitcoin websites coming down to mongo race conditions/inconsistency issues.


The article explains the rationale in some depth. Basically, they think it's too difficult to manage their current relational databases differently in each country, but they need to because of the different regulations and particulars of each country. With one global DB, they hope to have a single application for all countries, with country-specific things confined to a few sub-documents.

> It's now a one service environment, one database and one execution path for all the countries. This is made possible because of MongoDB's document model and the ability to map all the different table requirements for each country into a single collection, using sub-documents. Everything is simplified into one collection using country specific identifiers.

How they will address race conditions, eventual consistency and so on is not addressed at all, though. I guess they trust that Mongo's new transactions will take care of that (at which point it would be nice to know whether that actually works, and how the performance compares to the old relational system).
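Purely as a guess at what the quote implies, the document shape might look something like this; none of these field names come from HSBC or the article:

```python
# Purely illustrative guess at a "single collection with country sub-documents"
# shape; none of these field names come from HSBC or the article.
account = {
    "account_id": "GB-0001",
    "country": "GB",
    "core": {                      # fields common to every country
        "customer_id": "C123",
        "opened": "2019-04-01",
        "status": "active",
    },
    "country_specific": {          # per-country regulatory extras live here
        "GB": {"fscs_protected": True, "sort_code": "40-00-01"},
    },
}

# One query path for all countries, selected by country identifiers, e.g.:
# db.accounts.find({"country": "GB", "country_specific.GB.fscs_protected": True})
```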


Relax, HN... this is the reason for this:

"Local requirements for each country will be built into the application, but there's no need to maintain separate data models or separate databases anymore. We could easily design the global data model and database using the MongoDB JSON schema model. That brings data from all operating countries into one database and the application can run on just one database. Which is a lot of reduction in resource and maintenance cost."

Is there any other database that can do this, besides MongoDB? I keep hearing about data corruption in MongoDB; I believe that is due to bad out-of-the-box settings, which I expect they would have mitigated by working with the MongoDB company (which I am sure they are).


What's the exact advantage, one wonders? Each country still needs its own application, with its own models, with its own migrations, or at least ways of dealing with missing or mixed data types.

What if... those models mapped to SQL database schemas? Wouldn't that be magical? Not since 2008.

What's left is maintaining multiple databases. That sucks, indeed.

SQL databases support binary JSON as well these days (e.g. Postgres's JSONB).

What they probably want (or did) is to build a base framework suitable for all countries and have each country develop its own stuff on top. No need to use Mongo for that, though.

One use case I can imagine is that forms and inputs just change, and old information will never be compatible with the newer forms again. MongoDB then serves as a giant, more or less queryable data bin. One can ask the poor DBA to look up something for a client with "client_id": xyz and return the raw BSON/JSON output. Might be sufficient.


I just love software where the out-of-the-box settings lose your data.

At least HSBC are 'web scale' now!


POSIX filesystem?


Never mind the technology, a single data model would be a feat in itself.


I think the single data model in this case is probably something like requiring a couple of fields and everything else is whatever you need. Heh.


If you've ever used HSBC web banking you'll know it's one of the worst things mankind ever created. I'm not sure this is good news for MongoDB.


Is it common for a sufficiently large document DB to be just an un-normalized relational DB? I've seen Swagger UIs of document DBs that make me wonder wtf/why.


What's the over/under on how long it'll be until it's accidentally exposed to the internet and hacked?


There’s a lot of complaining here but MongoDB has been used by startups for like a decade?

I don’t think I’ve seen a single HN article complaining about actual data loss. That would be something that would get upvoted immediately.

So what gives? Especially since earlier and older versions of Mongo apparently had far less data stability.

I’ve probably read far more complaints about Postgres in HN articles (difficulty setting up, poor defaults, etc). And Postgres may not even be as popular as Mongo. So what gives?


I’m in the UK and glad I don’t bank with HSBC


Wasn't there a problem where Mongo would literally lose records under high load?

There was an article about it a few years ago.

Did they fix that?


This probably covers one aspect of HSBC's computing needs, likely less than 2% of the data they store.


Yeah, this is about par for HSBC IT.


Hiding your complexity behind a (poorly designed) layer doesn't make it go away.


How do they get ACID guarantees and schema enforcement?


MongoDB supports transactions, and you enforce schemas in code instead of in the database.

Pretty common these days to just rely on Git and not DDLs.
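A minimal sketch of a multi-document transaction with pymongo; transactions need a replica set (or sharded cluster), and the account IDs and amounts are made up:

```python
# Minimal sketch of a multi-document transaction in pymongo. Transactions need
# a replica set (or sharded cluster); account IDs and amounts are made up.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/?replicaSet=rs0")
accounts = client["bank"]["accounts"]

with client.start_session() as session:
    with session.start_transaction():
        accounts.update_one({"_id": "A"}, {"$inc": {"balance": -100}}, session=session)
        accounts.update_one({"_id": "B"}, {"$inc": {"balance": 100}}, session=session)
    # Leaving the inner block normally commits; an exception aborts the transaction.
```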


You can also tie JSON Schema descriptions to collections and have the server throw validation-failure exceptions on writes that don't conform.
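Something like this, as a sketch; the schema fields are illustrative:

```python
# Sketch of a $jsonSchema validator attached to a collection so that bad writes
# are rejected server-side; the schema fields are illustrative.
from pymongo import MongoClient
from pymongo.errors import WriteError

db = MongoClient()["bank"]
db.create_collection("payments", validator={
    "$jsonSchema": {
        "bsonType": "object",
        "required": ["payment_id", "amount", "currency"],
        "properties": {
            "payment_id": {"bsonType": "string"},
            "amount": {"bsonType": "decimal"},
            "currency": {"enum": ["GBP", "HKD", "USD", "EUR"]},
        },
    }
})

try:
    db["payments"].insert_one({"payment_id": 42})  # wrong type, missing fields
except WriteError as exc:
    print("rejected by validator:", exc)
```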


And I am moving all my money out of HSBC...


Modern take on shredding papers


Yeah this can't go well...


Behind Bad Indian Coder ...


And it’s gone.

(Sorry)


While I'm generally against getting too meme-y on HN, this is spot-on, adding to the already-long list of ways banks can make your money disappear and/or become unreachable.

Reference: https://www.youtube.com/watch?v=_nVk25ZvTkU


shocking


That was a mistake


Lol. Just lol.


Uh, how does this work with, e.g., GDPR or data protection laws that require all customer data to remain within the country of operation?


isn't this a bad idea


HSBC IT is a hollowed-out, outsourced shop. Just like Avis Budget, Disney, Walmart and Walgreens. They will all move to MongoDB data stores sooner rather than later.


And of course they will expose it to the web without authentication!


Old-timers: what DB do they use in London and Hong Kong? Are they still using mainframes?


Lots of negative comments about MongoDB from folks who don't use it, but perhaps give NoSQL a chance and keep an open mind? There is no need to structure the data into tables, and the schema does not need to be enforced at the DB level.

Google has been running its entire infrastructure on a NoSQL database for years.

Perhaps folks need to keep an open mind and see whether it works, instead of dismissing a new trend?


> Lots of negative comments about MongoDB for folks who don't use it, but perhaps give NoSQL a chance and keep an open mind?

Yes, keep an open mind, but please do not use a database that has a history of just losing data in mission critical software.

> there is no need to structure the data into tables and schema does not need to be enforced at the DB level.

There is no need to, just like your car doesn't need safety belts to function.

A database schema is mostly a safety mechanism: it prevents you from doing something stupid.

> Google has been running its entire infrastructure on a NoSQL database for years.

Do you have a source for this claim? I would be inclined to believe that the data for my Google account sits in a relational database somewhere.


“Google has been running their entire infrastructure...”

I'm skeptical about this: Google's entire infrastructure is so large that surely they must be using a wide variety of DB technologies?

I took a databases course that did mention that part of the Google search infrastructure runs on a tailor-made NoSQL technology, but it wasn't really a document-based approach; it was more akin to a relational DB with flexible columns (can't remember the details, sorry).

At any rate, I don't feel this is any vindication for NoSQL, as Google builds much of its underlying tech from scratch, and it's hardly comparable to the out-of-the-box solutions that the rest of us, even massive corporations like HSBC, have to work with.


https://en.wikipedia.org/wiki/Bigtable

"Bigtable development began in 2004[3] and is now used by a number of Google applications, such as web indexing,[4] MapReduce, which is often used for generating and modifying data stored in Bigtable,[5] Google Maps,[6] Google Book Search, "My Search History", Google Earth, Blogger.com, Google Code hosting, YouTube,[7] and Gmail.[8]"

"Bigtable is one of the prototypical examples of a wide column store. It maps two arbitrary string values (row key and column key) and timestamp (hence three-dimensional mapping) into an associated arbitrary byte array. It is not a relational database and can be better defined as a sparse, distributed multi-dimensional sorted map."


For MongoDB it's mostly a case of: it had issues years ago when it was the new hotness, but it should be a boring choice by now, and it's not. Neither is Cassandra. Kafka is boring, even more so once they kill off ZooKeeper.

NoSQL has proven harder to get right than was first assumed, and at the same time RDBMSes have improved and adopted many ideas from the NoSQL world faster than expected, often with better results.

What's not clear from the article is why MongoDB was picked; maybe there's something that made it the obvious choice, but it's hard to see what that might be.


"For MongoDB it’s mostly a case of it had issue years ago when it was the new hotness, but it should be a boring choice by now, and it’s not"

SQL has been around since the 1970s; give NoSQL some time.

"NoSQL has proven to be hard to get right than was first assumed"

Nothing has been proven; again, give it a chance. SQL has had years of optimization.

Also, no need to downvote just because you like SQL, your job depends on it, or it's all you know; debate with reason and evidence instead. I've used both and am open to changing my mind given evidence and case studies, of which we don't have many yet, while folks here predict doomsday for HSBC without keeping an open mind.


I think you're conflating MongoDB with NoSQL; they're not the same thing.


I'm aware of that, but read the other comments: most of the criticism is about the nature of NoSQL versus SQL databases. The most common criticism is the lack of a schema, and the underlying assumption is that banks are better off with SQL; after all, that is what they have been using for years... maybe not?

The other statements are not even well reasoned; you read things like "this is madness", "they up to disaster", "lol just lol", "they're gone", without any explanation of why. You would expect better arguments from the folks here.

Some pointed out flaws in the design or implementation, and I'd argue that with time and money those will get sorted out; again, SQL has had years of optimization and research.

I'd go further and argue that NoSQL is more flexible and easier to work with than SQL, and I think it is a good thing that large organizations are giving it a real try, so we get large-scale case studies, instead of everyone completely dismissing the effort as a failure from the get-go.


> You would expect better arguments from the folks here.

You would? Why?


Because I think there are a lot of highly educated tech people on this forum. Thus, I'd assume they would be more inclined to value reasonable and logical arguments, in contrast to an online space like a general Facebook group, for example.

You don't think I should keep that expectation?



