Oh my heavens, I wish every piece of technology described itself this way!
 - Slony -- https://www.slony.info/
Do they have constraints on rows? Are they using views or do they just denormalize and duplicate? Do they use joins at all? Are they even doing more than 1 thing in a transaction?
Normalization is a performance optimization. Denormalization is a development shortcut. Neither is right or wrong, but I would be surprised if a 50TB OLTP database wasn't already highly normalized.
If it isn't, then my next guess is that it could be made smaller or more performant if it were.
We used to be proud of server uptime, back when we gave them names. Today if you have a server up for 900 days you’re going to be accused of malpractice.
Similar for data. We used to be proud of being able to keep “big” data online, but I’m no longer impressed. You’re just spending money. Did you actually solve the business case as efficiently as possible given available resources? Do you need 50TB in the first place?
> Normalizing data can improve write performance but will make reads slower, so if you have a write intensive database then you will see some performance gains.
In OLTP this might be true, depending on the access pattern. Normalization may improve both read and write performance, or it could make both worse if you take it too far. Domain-key normal form could kill you in unique indexes alone.
In OLAP, mixing levels of normalization actually helps both read and write performance. Take a look at the star schema to see how that works: in general, fact tables are normalized with respect to their dimensions, but the dimensions themselves are denormalized.
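Roughly like this, as a toy in-memory sketch (table names and numbers are made up, not from the article): the fact table stores only surrogate keys plus measures, and each dimension row is itself flat.

    # Hypothetical star schema: normalized fact rows, denormalized dimension rows.
    dim_customer = {
        1: {"name": "Acme", "city": "Berlin", "region": "BE", "country": "DE"},
        2: {"name": "Globex", "city": "Lyon", "region": "ARA", "country": "FR"},
    }
    fact_sales = [
        {"customer_key": 1, "date_key": 20240101, "amount": 250.0},
        {"customer_key": 2, "date_key": 20240101, "amount": 120.0},
        {"customer_key": 1, "date_key": 20240102, "amount": 80.0},
    ]

    # A read resolves one key per dimension instead of walking a chain of joins.
    totals = {}
    for row in fact_sales:
        country = dim_customer[row["customer_key"]]["country"]
        totals[country] = totals.get(country, 0.0) + row["amount"]
    print(totals)  # {'DE': 330.0, 'FR': 120.0}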
Normalization is a tool like anything else. You can’t make absolute statements about it. The appropriate application will depend on your use case and may change over time.
That's exactly my point. I specifically said that normalization is independent of optimization: it might help, it might not. That was in response to your absolute statement about normalization being critical to performance, and I quote:
>Normalization is a performance optimization.
Emphasis taken directly from your quote.
Normalization has very little to do with optimization, and optimization is one of the least-cited reasons (if it's cited at all) for doing it.
On most servers the most plentiful and cheap resource is the CPU (not the storage!), and an adequate level of normalization loads CPU while relieving storage.
Normalization mostly adds extra storage 'seeks', which don't cost much on some storage (SSDs...).
Denormalization is a performance optimization. By duplicating data you reduce the need for costly joins at the expense of data consistency.
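A tiny sketch of that trade-off, with invented rows: the duplicated value saves a lookup on read, but a missed update leaves the copies out of sync.

    # Normalized: one authoritative copy; reads pay for a lookup (the "join").
    customers = {42: {"name": "Acme"}}
    orders_norm = [{"id": 1, "customer_id": 42}]
    read_norm = customers[orders_norm[0]["customer_id"]]["name"]  # lookup at read time

    # Denormalized: the name is copied onto each order; reads are a plain field
    # access, but every copy must be rewritten on rename or the data drifts.
    orders_denorm = [{"id": 1, "customer_id": 42, "customer_name": "Acme"},
                     {"id": 2, "customer_id": 42, "customer_name": "Acme"}]
    customers[42]["name"] = "Acme Corp"        # rename...
    stale = orders_denorm[0]["customer_name"]  # ...still "Acme": inconsistent copy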
In the last year my team did a lot of iterating on our DynamoDB schema and eventually just re-discovered normalization.
From what I can tell, the benefit of databases like DynamoDB is that you can shard your workload over many hosts mostly transparently, so you get the benefit of more resources than fit in one box. But it's not magic, and you pay the price in other areas, such as hot partitions and implementing join logic in your application.
Also, Postgres is the subject at hand and is relational, so normalization is an unavoidable design decision.
Said another way normalization is a fundamental concept to modeling data in general. It might even be the fundamental question. You have to balance your requirements to arrive at an answer.
In our case many of our objects carried redundant (i.e. denormalized) data, so updates required multiple calls to the DynamoDB service. By normalizing, we made fewer service calls and saw throughput gains in our application. Additionally, we had conflated a couple of our domain-specific concepts in the data model; by splitting what was actually two independent entities that had been modeled as one, we reduced the absolute record count.
I describe these optimizations as "making the data smaller" and "normalization".
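To make the normalization part concrete, here's a rough boto3 sketch (the single-table layout and key names are invented for illustration, not our actual schema): the attribute that used to be copied onto every related item lives on one item, so an update is one call instead of N.

    import boto3

    # Hypothetical single-table design, purely for illustration.
    table = boto3.resource("dynamodb").Table("app-data")

    # Normalized: the customer's name lives on exactly one item; orders only
    # reference the customer through the partition key.
    table.put_item(Item={"pk": "CUSTOMER#42", "sk": "PROFILE", "name": "Acme"})
    table.put_item(Item={"pk": "CUSTOMER#42", "sk": "ORDER#1001", "total": 250})
    table.put_item(Item={"pk": "CUSTOMER#42", "sk": "ORDER#1002", "total": 90})

    # Renaming the customer is now a single UpdateItem call; in the denormalized
    # version the name was duplicated onto every order and each copy needed a write.
    table.update_item(
        Key={"pk": "CUSTOMER#42", "sk": "PROFILE"},
        UpdateExpression="SET #n = :n",
        ExpressionAttributeNames={"#n": "name"},
        ExpressionAttributeValues={":n": "Acme Corp"},
    )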
That is a stupid statement.
It's way too generic, and it depends on the use case. Read-heavy data that needs a lot of joins is most often denormalized, IF heavy updating is not a problem. Sometimes you need to create views that pull in different data with different queries, which makes them not really performant, especially on Postgres, which is just super slow when it comes to listing/filtering data.
Some old inactive data can be moved into partitions, then detached and moved to a different DB instance, as an archiving step.
It reads like their processed payments (5000 tx/second) are very suitable for daily, monthly, yearly archiving.
For analytics, a summary table can be kept in place of the detailed (archived) transactions.
For statutory reporting, the slow archives can be accessed on demand.
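A rough sketch of that detach-and-archive step in Python, assuming a payments table range-partitioned by month (all table, partition, and database names are invented):

    import subprocess
    import psycopg2

    conn = psycopg2.connect("dbname=payments_prod")
    conn.autocommit = True  # DETACH ... CONCURRENTLY can't run in a transaction block

    with conn.cursor() as cur:
        # The old partition becomes a plain standalone table (PostgreSQL 14+ syntax).
        cur.execute("ALTER TABLE payments DETACH PARTITION payments_2021_01 CONCURRENTLY")

    # Dump the standalone table so it can be restored on the archive instance.
    subprocess.run(
        ["pg_dump", "--table=payments_2021_01",
         "--file=payments_2021_01.sql", "payments_prod"],
        check=True,
    )

    # (restore payments_2021_01.sql on the archive DB, then drop locally)
    with conn.cursor() as cur:
        cur.execute("DROP TABLE payments_2021_01")
    conn.close()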
Overall my takeaway is basically "if you want to upgrade a large Postgres DB, you'll need like an hour of planned downtime and a lot of careful work", which… doesn't make me excited to upgrade big Postgres DBs in production.
HA is one of those things where MySQL wins hands down, sadly. Sadly in the sense that PostgreSQL HA still looks like a couple of hacks held together with duct tape, at least when compared to MySQL solutions.
The Percona MySQL distribution has Percona XtraDB Cluster, which is great for HA (multi-master), and vanilla MySQL 8.x has group replication, which IIRC is also multi-master. MariaDB instead has multi-master based on Galera Cluster (although I'm not very knowledgeable about MariaDB).
In PostgreSQL world there are many HA solutions, none of which is both multi-master and open source (sadly).
Speaking from operational experience with MariaDB Galera and Percona XtraDB Cluster (which is also Galera-based).
This statement (and the phrases that follow) is so vague that the only sensible reply I can give is that you're probably using it wrong, and that you should go back to the fine manual.
We run several Percona XtraDB Clusters (PXC), and even though there is the occasional crash (usually a node going OOM), it's very stable. So far we have never had a whole cluster go down. In the worst case, a single node was still up and operative while we recovered the other nodes.
Of course stability doesn't come for free: we spent some time using PMM (Percona Monitoring & Management) to find bottlenecks, and we keep PMM around for developers to troubleshoot performance issues on their own.
At the same time, we just don't grant risky permissions to developers. One of the riskiest grants when dealing with PXC is ALTER TABLE, which requires the appropriate tooling (pt-online-schema-change from the Percona Toolkit) and also requires following a procedure. Need an ALTER? Open a ticket. Ez.
What can I say?
If PXC doesn't work well for you, you're probably using it wrong.
Also note that Galera is just a set of APIs (https://github.com/codership/wsrep-API), which can be implemented against other databases to bring the same replication capability.
Yes, we did that too; it's the only way to reduce deadlocks and certification failures with Galera. However, it won't prevent a single `ALTER TABLE` from locking up the entire cluster. You can use `pt-online-schema-change` to work around that, but it's not perfect.
We've switched to simple Asynchronous Replication with Openark Orchestrator for the majority of our clusters and the DBA team is much happier now.
Gitlab wrote a much more in-depth rundown in 2020 of a pg_upgrade, which in their case took 2h of downtime and 4h total time: https://about.gitlab.com/blog/2020/09/11/gitlab-pg-upgrade/ (submitted to HN here: https://news.ycombinator.com/item?id=29926316). They included a video of the whole thing which is pretty incredible.
Site-wise no, but DB-wise yes.
If someone here commented they had a 2PB database, I guarantee someone else here would be like "pfft, that's not big"...
A 130M record table with no indexes is going to be crazy slow. Although if all you need are primary key updates, then that's the way to go.
Some time ago, when I studied the Bitcoin ledger structure, I was confused about how it sums up all transactions to get a balance so quickly, the way we usually do it in accounting with the help of some periodic caching, which is another point of failure and maintenance. Bitcoin is a large enough database that it can't do that easily. A few docs later I realized that it doesn't do this:

    mining -> A 1
    A -> B 0.2
    (sum(A) == 0.8)

And instead it does this:

    mining -> A 1
    A -> (B 0.2; A 0.8)
    (last(A) == 0.8)
For what it's worth, in a system design context, this is called event sourcing (in more general terms it's just a fold, though the idea of event sourcing doesn't preclude caching). I worked at a bank for a few years and this was how we calculated balances too.
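A toy version of that fold, with made-up events (not how any particular bank does it):

    from functools import reduce

    # Each event is (account, delta); the balance is derived, never stored.
    events = [("A", +1.0), ("A", -0.2), ("B", +0.2)]

    def apply(balances, event):
        account, delta = event
        balances[account] = balances.get(account, 0.0) + delta
        return balances

    print(reduce(apply, events, {}))  # {'A': 0.8, 'B': 0.2}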
> And instead it does this:
I don't quite follow you here. It does something distinct from either a fold or caching the current total? (Do you mean that each transaction encodes the resultant balance?)
If the input is worth 50 BTC but you only want to send 25 BTC, Bitcoin will create two outputs worth 25 BTC: one to the destination, and one back to you (known as "change", though you send it to yourself). Any input bitcoins not redeemed in an output is considered a transaction fee; whoever generates the block can claim it by inserting it into the coinbase transaction of that block.
I.e. any "input" is spent completely by a transaction and this makes it irrelevant to future calculations.
See also: https://bitcoin.stackexchange.com/questions/13069/how-does-t...
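To put numbers on the example above (a toy transaction in satoshis; the amounts are invented):

    SAT = 100_000_000  # 1 BTC in satoshis

    # A transaction consumes whole previous outputs and creates new ones;
    # whatever isn't claimed by an output is the fee.
    inputs = [50 * SAT]                       # one prior output worth 50 BTC
    outputs = {
        "destination": 25 * SAT,              # the payment itself
        "change_to_self": 25 * SAT - 10_000,  # change sent back to the spender
    }
    fee = sum(inputs) - sum(outputs.values())
    print(fee)  # 10000 satoshis, claimable via the block's coinbase transaction

    # The spender's balance is simply the sum of outputs they can still spend:
    my_unspent = [outputs["change_to_self"]]
    print(sum(my_unspent) / SAT)  # 24.9999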