> Locks limit access to the table, so on a high throughput use case it may limit our performance.
Then use a proper database that implements MVCC.
> Do not use transactions, which introduce locks. Instead, use applicative transactions.
Or just use a database that handles transactions more efficiently.
> `site_id` varchar(50) NOT NULL,
Why varchar(50)? UUIDs are 16 bytes. The best way to store them would be the binary bytes (which is how postgres stores them). If it's hex without dashes, it'll be varchar(32). If it's hex with dashes, it'll be varchar(36). So why did they pick 50? Future growth? Smart keys? Does the schema designer not know what a UUID actually is?
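To make this concrete, here is a rough sketch (table/column names invented) of what 16-byte storage looks like on each side:

    -- PostgreSQL: a native 16-byte uuid type exists
    CREATE TABLE sites_pg (site_id uuid PRIMARY KEY);

    -- MySQL: no uuid type, so pack the hex into 16 bytes yourself
    CREATE TABLE sites_my (site_id binary(16) PRIMARY KEY);
    INSERT INTO sites_my VALUES (UNHEX(REPLACE(UUID(), '-', '')));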
> Do not normalize.
Bullshit. Normalize as much as is practical and denormalize as necessary. It's much easier to denormalize and it greatly simplifies any transaction logic to deal with a normalized model.
> Fields only exist to be indexed. If a field is not needed for an index, store it in one blob/text field (such as JSON or XML).
This is terrible advice. Fields (in a table) exist to be read, filtered, and returned. If everything is in a BLOB then you have to deserialize that BLOB to do any of those. That doesn't mean you can't have JSON "meta" fields, but if your entire schema is (id uuid, data json) you're probably doing it wrong. It's next to impossible to enforce proper data constraints, and all your application logic becomes if/then/else/if/then/else... to deal with the N+1 possibilities of data. Oh, and when you finally add a new one, you have to update the code in M+1 places.
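To illustrate (hypothetical names, PostgreSQL syntax): the one-blob shape can enforce nothing, while a typed schema states the invariants once, and a JSON "meta" column can still sit alongside the real fields:

    -- anything goes; every writer re-implements the checks
    CREATE TABLE docs_blob (id uuid PRIMARY KEY, data json);

    -- the database enforces the constraints for every writer
    CREATE TABLE docs_typed (
        id         uuid PRIMARY KEY,
        state      text NOT NULL CHECK (state IN ('draft', 'published')),
        created_at timestamptz NOT NULL DEFAULT now(),
        meta       jsonb   -- unindexed leftovers can still live here
    );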
> Then use a proper database that implements MVCC.
InnoDB does implement MVCC. MVCC is not a silver bullet.
>> Do not use transactions, which introduce locks. Instead, use applicative transactions.
> Or just use a database that handles transactions more efficiently.
Easy to say, hard to implement at this scale. If you do a lot of writes and reads concurrently to a hot dataset, it's really quite hard to beat this architecture. This is why it's such a popular and battle-tested solution for many extremely high scale applications with workloads like this. Not to mention extremely well understood.
>> Do not normalize.
> Bullshit. Normalize as much as is practical and denormalize as necessary. It's much easier to denormalize and it greatly simplifies any transaction logic to deal with a normalized model.
But we are talking about performance... Having something denormalized in a single table is always going to be faster than having an elegant data model with "Everything In Its Right Place".
>> Fields only exist to be indexed. If a field is not needed for an index, store it in one blob/text field (such as JSON or XML).
> This is terrible advice.
So facebook/friendfeed, uber, dropbox, and many more are wrong then. Ok.
This is really all best practice for running something like this.
Of course it flies in the face of best practice for running a smaller system. Are there tradeoffs? Absolutely! Would it be smart to do this if the need for this scale is not obvious? Probably not.
You end up having more logic in your application and coordination layers, but this is all pretty good advice for people at this scale, and certainly not bad at all.
From the article:
> The routes table is of the order of magnitude of 100,000,000 records, 10GB of storage.
> The sites table is of the order of magnitude of 100,000,000 records, 200GB of storage
That's tiny. Both of those easily fit in memory on modern hardware. This isn't cough web scale, this is peanuts.
The savings from having a simpler system that operates transactionally, and from the lack of disparate CASE/IF logic, would win over this monstrosity of a design.
For a counterpoint where this type of model makes more sense, check out Uber's data model. Similar setup but a more applicable use case, and (without having any inside intel on it) I'd wager it's justified.
Sounds like we are in agreement then - at high enough scale - this is solid advice.
No, my argument is that this particular design is both unjustified for the use case and poorly thought out/implemented. The uuid as varchar(50) is a dead giveaway of amateur status.
Like you said, 10GB of data isn't very much; it really doesn't matter if you go with NoSQL or SQL. But SQL will probably give you more flexibility and will be easier to manage until you get really, really big.
* A lot of traditional sql solutions have scaling issues.
* A lot of nosql solutions for these issues have become popular in recent years. Their main use case is web scale (simplification)
* A ton of actual use cases fall between those two chairs, i.e. they would have scaling issues with traditional sql systems but don't really have billions of transactions to process.
The article presents some valid real world approaches for this third case; some of it might be case-dependent, but overall it's pretty good advice.
If you have scaling issues with traditional SQL systems, the first thing you should do is understand your problem, not cargo-cult to a different paradigm.
* You do not get to complain about SQL transaction speed if you use denormalized tables.
* You do not get to complain about SQL transaction speed if your primary keys are varchar(50).
* You do not get to complain about SQL transaction speed if you store all non-identifying data in a single json blob.
This is one piece of folk wisdom that is untrue. There are significant speed disadvantages relating to large blobs of data the database doesn't understand. Serialisation time makes returning large JSON/XML objects expensive when you only need a small part. Overwriting a whole object to increment a counter is an unnecessary source of IO. Duplicating JSON keys in every record bloats the size of your working set, making it more difficult to fit into memory (or the fast part of your SAN).
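The counter case is easy to sketch (PostgreSQL jsonb shown; table/column names are hypothetical). The blob variant reads, deserialises, and rewrites the whole document to bump one field; the column variant is an in-place increment:

    -- counter buried in a blob: the whole object is rewritten
    UPDATE sites
       SET data = jsonb_set(data, '{views}',
                  ((data ->> 'views')::bigint + 1)::text::jsonb)
     WHERE site_id = '...';

    -- counter as a real column: only that value changes
    UPDATE sites SET views = views + 1 WHERE site_id = '...';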
99% of denormalisation out there is unnecessary and has inferior performance. The best route to performance with row store SQL databases (any database?) is twofold: 1) get an expert on your database to help you write your code and 2) get an expert on your hardware to help you choose wisely. Denormalisation is typically a way to take a performance problem and make it worse while introducing data corruption and race conditions.
> Serialisation time makes returning large JSON/XML objects expensive when you only need a small part.
the expensive part of reads is finding the row on disk. once you've found the row the cost of reading part of the row vs the whole row is negligible. amount of data sent over the network doesn't matter either in 99% of cases. these days network bandwidth is orders of magnitude greater than random IO bandwidth on a spinning disk and still greater than random IO bandwidth on a SSD assuming you're using 10GbE.
> Overwriting a whole object to increment a counter is an unnecessary source of IO.
there is no way to write 4 bytes to a hard disk. disk io is performed in multiples of 4096 bytes, so it doesn't matter whether you just update the counter or update the whole blob. only incrementing the counter may allow you to write less data to the write ahead log, so you may save some IO there, but most databases put the whole row in the log anyway so it doesn't matter.
> Duplicating JSON keys in every record bloats the size of your working set, making it more difficult to fit into memory (or the fast part of your SAN).
this is definitely true, it's better to use an extensible but compact format like protobuf or thrift if you have a significant amount of data. or you could use a better database and not have to worry about the cost of adding columns.
There's a lot to migrations in these various implementations but in short: every variant of alter table is an "online" operation (meaning it doesn't hold an exclusive lock) in InnoDB as of MySQL 5.6 and it's possible with various well-supported third party tools before that. For Postgres: most migrations can be done online and those that do hold a lock are typically constant time.
Admittedly migration has been a big problem in the past but that hasn't been true for years now.
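For instance, since MySQL 5.6 you can request the online behaviour explicitly, and the statement fails rather than silently taking an exclusive lock if it can't comply (illustrative names):

    ALTER TABLE sites
        ADD COLUMN last_published DATETIME NULL,
        ALGORITHM=INPLACE, LOCK=NONE;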
> the expensive part of reads is finding the row on disk
Hopefully most SQL queries in an optimised setup are not finding a row on disk! The difference between reading the whole row from on-disk heap storage and reading the interesting part of it from an index in memory is in fact considerable: 1000x or more - and obviously far worse if you have to scan through any of the heap storage.
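As a sketch of that difference (PostgreSQL, hypothetical names): keep the interesting columns in a narrow index and the hot query can be answered as an index-only scan from cached index pages, never visiting the heap (assuming the visibility map is current):

    CREATE INDEX routes_by_site ON routes (site_id, route);

    -- can be served entirely from the index, no heap access
    SELECT route FROM routes WHERE site_id = '...';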
> amount of data sent over the network doesn't matter either in 99% of cases
It actually matters hugely in the case of JSON blob storage because it all has to be doubly deserialised on the other end - first from the db's wire protocol and then from JSON. There are many apps out there for which JSON serialisation is a bottleneck (for the CPU, not the network) - that's why there are so many "fast JSON" libraries.
Good point - you could mitigate this by using something quicker. I haven't seen anyone do that, though - the ability to read the database with other tools is normally useful.
> most databases put the whole row in the log anyway so it doesn't matter
re: this whole topic - I'm not proposing switching a JSON blob with a table for which every JSON field was reified into a column. I'm comparing to a normalised design with narrow tables (most tables have fewer than 5 columns). The other stuff about serialisation applies.
Unless you specify the workload, that's anywhere between completely true and exactly incorrect. Do you have big values you're always interested in and a couple of tiny ids? That's probably going to be faster in one table.
Are you querying only the metadata most of the time and the big value is multiple KB, almost never accessed? You're just killing your readahead and multiple levels of caches for no reason. "always going to be faster" is always incorrect ;)
That said, use a system that's a good match for your data model(s)... if your data can fit in a handful of collections, but may have varying object shapes/structure for semi-related data, a document store may work better. Need massive scale? C* may be your best bet. There are use cases that are great fits for just about every database server that's been made. Some similar options may be a better fit, but it all depends.
Personally, I'm hoping to see in-the-box replication options for PostgreSQL in the next few releases; then it will probably be my go-to option for most stuff. RethinkDB is pretty damned nice too, and you should be able to scale to many millions of users.
Once you hit certain scales, you usually have to implement a number of solutions... be they sharding, caching or queuing to deal with higher load. It depends on where your bottlenecks are.
Not always, particularly when you're returning multiple keys at once. Some comparisons with column stores would be instructive.
I do agree that UUIDs should be stored differently -- the use of varchar rather than a fixed length type for a primary key will hurt performance.
I would love to switch to native UUID someday though.
ALTER TABLE my_table ALTER COLUMN my_uuid TYPE uuid USING my_uuid::uuid;
Conversion between the database-native UUID type and the application-layer UUID type is often so flaky (particularly with Java and JPA).
Tip: There is no performance difference among these three types, apart from increased storage space when using the blank-padded type, and a few extra CPU cycles to check the length when storing into a length-constrained column. While character(n) has performance advantages in some other database systems, there is no such advantage in PostgreSQL; in fact character(n) is usually the slowest of the three because of its additional storage costs and slower sorting. In most situations text or character varying should be used instead.
In innodb locks are row-level; myisam supports table-level locks, though a lock is not a transaction and shouldn't be confused with one. I don't know what a "database-level" lock is supposed to mean - are they really saying they're locking all tables to do a write? It doesn't sound like this author understands what a transaction is.
As many have pointed out, much of the criticism of this article ignores that it is comparing to other key-value stores which are not transactional. Many of the systems this would compete with are AP, with Consistency not guaranteed.
It really sounds like they should be talking about MySQL cluster, which is protocol compatible but a completely separate implementation and essentially a key-value store with RDBMS attributes atop it. It supports many-master mode like mongo and other distributed systems, which is fairly mandatory for replacing them. It's hard to argue you can replace HDFS with anything that's not distributed, and if you didn't, why wouldn't you just use .. the actual FS? The author may not really understand that HDFS is optimized for storing large-ish files.
For example, 'Do not normalize.'
This was in the context of a read heavy table that competes with NoSQL. In that context, I think this is accurate. We noticed a big difference after denormalizing when we went from millions of rows to billions of rows.
The general advice of SQL solutions being as useful as NoSQL to a certain scale is good. I don't think the individual examples are horrible, but they aren't universal advice to achieve NoSQL performance.
> Why varchar(50)? UUIDs are 16 bytes.
Why do you think it's a UUID?
> The best way to store them would be the binary bytes (which is how postgres stores them).
Is it actually better than a pair of BIGINTs?
Because of this:
>Also notice that we are not using serial keys; instead, we are using varchar(50), which stores client-generated GUID values
As long as it meets the statistical properties for collision, I don't see any problems with making a 50-byte GUID. The essential thing is the statistical property, not the number of bits.
Like I mentioned, there are different UUID/GUID schemes using more than 128 bits publicly available already. You don't have to invent one.
They may simply be using one of those with additional bytes reserved for future algorithmic changes (using first few bytes as a header to specify the algorithm).
Or just char(32); there's no need for a variable-length type if the length is always the same.
Oh, and citing statistics without details is plain lying: how many servers, how much RAM, SSD-based or HDD....
I think the important thing to note here is that there are lots of different ways to use any given tool that can fit your use case without being an atrocity.
Personally, I'd be inclined to use PostgreSQL for most things once they have an in-the-box solution for replication and failover that doesn't require a commercial contract for a sane setup. Every time I've used MySQL (admittedly not for about a decade), I wind up with a lot of WTF moments. If I'm paying for it, I may go with MS-SQL, or a hosted "cloud" provider as, tbh, I have no interest in being a DBA.
I really like RethinkDB's approach as a no-sql solution, and their admin interface is second to none in this space. If you really need more scale (more than around 20 servers or so), then you should probably be using Cassandra, and/or supplementing with the likes of ElasticSearch.
In any case, if you want real-time data for hundreds of millions of users, you're going to have data partitioning and redundant replication and caching solutions.
I worked at GoDaddy on Website Builder while they were transitioning their publishing target to C*, and I have to say it performs incredibly well as basically a distributed file system in the role described in the article.
Uber built something similar in the last couple of years.
So did Dropbox.
HStore is a key value store built directly in the RDBMS of Postgres.
HStore was released, we've migrated to PG and we can't be happier. Zero issues so far.
I agree with the Wix team. I've used Mongo in a very high volume site and it was a constant source of headaches. We even had an engineer we called "MongoDB" because he was constantly having to deal with scaling it. It didn't even handle more than 10% of the application data, it was strictly for timelines; the rest of it was in trusty old MySQL. If we'd just done blobs as JSON in MySQL in the first place we would have been completely fine.
Even though I prefer Postgres, knowing nothing else, I would prefer 100% SQL in MySQL to a Postgres-backed application that splits the storage between tables and a KV store.
Well, to be fair, you would typically expect a database that's the equivalent of a remote hash table to be pretty much as fast as you can get. Now, I don't have any experience with couch, but most of the other key-value stores I've used scream with performance. But if you're doing anything beyond basic manipulations, then it's going to require a lot of tuning depending on the solution you went with.
But an RDBMS can be very similar. Both are useful tools when used correctly, and there is a huge amount of overlap in terms of capability.
Most applications are probably fine regardless of the underlying data store, and are better served by what works best for the model of the data, which can often be SQL. I'm more inclined to reach for mongo or rethinkdb first, as I tend to think in terms of more complex records/objects in most of the applications I've worked on.
Hstore is more k->k->v, which is the same but different, and also leads inexperienced developers to model entire relationships in a single column
I have some experience with this, having worked on an experimental storage engine for mysql that we connected to a transactional in-memory k-v store. The performance penalty for a simple k-v workload through mysql was quite substantial, though our storage engine code was probably not sufficiently optimized. It would be interesting to explore this for innodb though.
I can say that 10 years ago, I would have chosen M$SQL over MySQL and it would have been the correct choice. At the time I had almost 10 years of experience with M$SQL and almost none with MySQL. Now I have more than 10 years of MySQL under my belt, AND the MySQL experience is more current. Right now I could choose between the two based on specific features and performance characteristics. For me to pick PostgreSQL because of a specific feature would be insane, because I don't have experience with it. No knock on PostgreSQL ... maybe I'll spend time with it and pick it for some down-the-road project.
I have implemented couchdb as a caching solution. I know how to manage, backup, and restore the database server. I have managed a 5 node cluster. If you asked me to implement NoSQL, it would be my choice for 2 reasons:
1. It can do the job.
2. I have experience making it do the job.
I'm sure there are 10 million people out there who would choose mongo in the same situation. They would not be wrong, and they may come up with a superior solution. For me to implement Mongo today would be wrong - I would almost certainly come up with an inferior solution. For them, it would be stupid not to.
I'm not saying "don't learn anything new". I'm saying "don't gamble your business on technology with which you're not familiar".
It's a bit like backups ... the most important thing about a backup is not the technology you use, but whether you are capable of restoring and maintaining the backups.
Secondly, the article isn't about what is "better" overall. It's about scaling SQL, and how noSQL isn't always necessary. The "in" thing to do right now is to have noSQL in your stack, blindly, without looking at your project. Or doing expensive migrations to noSQL solutions when you already have an expansive infrastructure built on SQL but need to scale. Wix is just giving insight into their techniques with MySQL and how in the end it made more sense for them than going with something like Mongo.
TL;DR: You didn't read the article.
(Yes, they say active-active-active, not master-master-master, but then they say across DCs... It could be just M-S-S with config switch on failover, but for me the post suggests it's not that)
It's weird to me that they're hosting web sites that need inline MySQL at all, but I suppose they probably do a lot of fancy backend stuff. Much better to async queue any data sending unless it needs to be inline. Much better than that is to static cache.
We're straight up static + nginx for all site serving, via an anycast geo CDN. Logging is passively provided by the nginx logfiles, which are parsed hourly asynchronously. It's a pretty good system, and about as fast as site delivery can get. I guess I don't know what Wix offers for site features, but for simple web sites, why bother with a database for web site display at all?
That said, I agree with the article's premise. I'm partial to PostgreSQL, but it's dumb to say it doesn't work and that some wacky new TrendDB that doesn't handle fsync or atomicity correctly (and is therefore TEH SCALE!) is somehow better for this job.
A setup that works for a certain service won't necessarily work for another unless yours is a very close replica. Based on my experience in this area, and I'm a performance seeking nut, each platform, and even each traffic pattern, needs its own thinking hat.
That's what makes it so fun!
Minor note, but wouldn't UUIDs be better since they're time based? Sure, it's really unlikely to hit an already used GUID, but a UUID makes it impossible.
In fact, is there a use case where it's better to use GUIDs over UUIDs? I couldn't think of one, but I could be omitting something from my thinking, so I'm curious.
Edit: apparently GUID and UUID are the same thing and GUID is simply Microsoft's original implementation of UUID. All this time I had no idea...
Consider UUID4, the one with 122 random bits. The birthday paradox says that you would need about 2^61 UUIDs before you expect even one duplicate. If this concerns you, you might not recognize how big 2^61 is.
(edited because I was originally talking about 2^64, but there are 6 non-random bits in UUID4)
Let's make this real concrete, with 122 random bits, you can issue a million UUIDv4s every second for the next 100 years and still have a less than one in a million chance that you issued a duplicate.
n = 5316911983139663491615228241121378304 (2^122)
p = 0.000001
=> m = 3260955271619137
3260955271619137 / (1000000 * 86400 * 365) => 103 years
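For anyone wanting to reproduce that m, it's the usual birthday-bound approximation m ≈ sqrt(2n * ln(1/(1-p))); you can sanity-check it in any SQL shell, e.g. PostgreSQL:

    SELECT sqrt(2 * power(2, 122) * ln(1 / (1 - 0.000001)));
    -- ≈ 3.26e15 draws, i.e. ~103 years at a million UUIDs per second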
So Wix uses MySQL to resolve site routes internally? Is this the best way to do it? Would it be possible to use internal domain names and rely on DNS to resolve everything?
Re: nosql, I'm coming at that with really positive experiences in Cassandra but I can't imagine what kind of DNS system it would be a good fit for. The ability to tune CAP to fit DNS may be useful but in general I think of Cassandra as the solution you think about when your 2 node vertically scaled monster can't keep up.
You get free caching which is tolerant to partitions, but you'd get that with either solution, since it's largely the forwarding dns servers which make that happen, and the difference here is constrained to the authoritative resolving dns servers.
For companies other than Wix I don't know what is used to handle it on the back end, but I imagine it's either some specialized piece of hardware that can handle an insane load, or commodity hardware / a cloud service & in-house software like here.
At the end of the day, and I'm sure I'm missing some edge cases, I think it's basically a service that provides a mapping of domain.com/user-resource-or-website to the location of their resources with no lag time when changes are made.
But over time I'm finding less and less reason to _not_ use PostgreSQL when contemplating a NoSQL document store.
On top of that, MySQL is well-documented and lots of people use it for lots of different things, which makes the most common (and most of the uncommon) problems eminently google-able.
Anecdotally, I've avoided Postgres also in part because it had the same problem that Perl had when I got into Linux: the community was pretty toxic and unwelcoming. I asked about replication once, and I got a bunch of answers that fell into a few categories: 'Why on earth do you need replication?', 'You don't need replication, Postgres is fast enough', 'set up a script to rsync your data directories over and over again', 'mysql replication is unreliable so you shouldn't use it so it doesn't matter'.
That sort of attitude drove me (and, I'd wager, a lot of other people) away from Postgres and towards the vastly more welcoming MySQL community, and I'd wager that a lot of people just kind of stuck there because why switch?
Postgres has added a lot of pretty great features lately, and it seems good? But I'm left wondering why I should use Postgres for my RDBMS, my JSON data store, and my KV store, when history has shown that one piece of software that expands to do everything rarely does any of it well.
(Postgres has since gained replication abilities)
I seriously disagree with that. Documentation is one of MySQL's weaknesses, especially compared to PostgreSQL, which is known to have some of the best documentation in the open source world.
I really love the PostgreSQL manual.
I ask because I always miss this feature when querying MySQL.
"we had the MySQL engineering talent we needed to work with the Oracle team to get 5.6 ready for production at our scale."
"We all worked hard to adapt 5.6 to our scale and ensure that it would be production-ready. We found some issues after production deployment, but in many cases we could fix the problem and deployed new MySQL binary within one or two days"
"Performance regression of the CPU intensive replication was a main blocker for some of our applications" followed by a description of how they addressed that.
So it's not vanilla MySQL vs vanilla PostgreSQL. They tailored MySQL to their needs and keep honing it. What they do bears little resemblance to what the other 99.9999% of companies do, and I'm probably missing a few 9s. Another excerpt from that post highlights the differences:
"For example, typical MySQL DBA at small companies may not encounter master instance failure during employment, because recent mysqld and H/W are stable enough. At Facebook, master failure is a norm and something the system can accommodate."
My take: if they started with and stuck to PostgreSQL they'd have to work on it as they did on MySQL.
Interestingly much Facebook database work has been outside the relational orbit on systems like Cassandra and HBase, as well as RocksDB. The choice for most web properties today is not MySQL vs. PostgreSQL but RDBMS vs. "other" systems.
They continue to use MySQL on new projects.
http://www.woorden.org/woord/flikker (tl;dr: homosexual)
Our response -
I don't have experience with MongoDb and such, but I've always asked myself why someone wouldn't use Solr as a distributed NoSQL database... Am I wrong or, with Solr, you get that key/value scalable storage AND you get advanced search features as an extra?
Why would I want to use MongoDb instead of Solr? What killer feature Solr doesn't have?
Some issues are:
async indexes, inability to modify/remove indexes, inability to grow/shrink the number of shards, etc. (basically, search for "why not use ES as a primary data store")
How are they performing horizontal scaling? I'm guessing they aren't. Without addressing the issue of sharding and scaling they can't really compare the solution to NoSQL - it is the number one feature that NoSQL has over an RDBMS.
If they are achieving 1ms response times, then they almost certainly have the entire table in the memory cache.
What happens when the data grows beyond the size of memory and it's not financially feasible to get a larger memory instance?
2. 1ms is achievable with SSDs, but 200K q/minute seems slow, my gut feeling tells me.
This post is more like "ha we don't need NoSQL for this special use case" - Once you need scaling and some sort of atomics, you quickly have to use HBase for row-level atomicity and scaling.
Redis is probably better suited for the posted usecase anyway.
1. Use PostgreSQL, or MySQL with InnoDB, for row-level locking
2. Huge tables should be sharded with the shard key being a prefix of the primary key.
If you need to access the same data via different indexes then denormalize and duplicate the index data in one or more "index" tables.
3. Do not use global locks. Generate random strings for unique ids (attempt INSERT and regenerate until it succeeds) instead of autoincrement - see the sketch after this list.
4. Avoid JOINs across shards. If you use these, you won't be able to shard your app layer anymore.
5. For reads, feel free to put caches in front of the database, with the keys same as the PK. Invalidate the caches for rows being written to.
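A minimal sketch of points 2 and 3 (PostgreSQL syntax, all names hypothetical): the shard key leads the primary key, and ids are client-generated random strings that get regenerated on the rare collision instead of coming from a global autoincrement:

    CREATE TABLE events (
        shard_id smallint    NOT NULL,  -- PK prefix, drives shard routing
        event_id varchar(22) NOT NULL,  -- client-generated random string
        payload  text        NOT NULL,
        PRIMARY KEY (shard_id, event_id)
    );

    -- attempt the INSERT; if zero rows are inserted, regenerate the id and retry
    INSERT INTO events (shard_id, event_id, payload)
    VALUES (7, 'aZ3kQ9xW1pL0mN8rT5vB2c', '...')
    ON CONFLICT DO NOTHING;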
It's actually pretty easy to model. You have the fields for the data. Then you think by which index will it be requested? Shard by that.
Note that this will still lead you to a huge centralized datacenter!! Because your authentication happens at the webserver level and then you just have all the servers trust each other. While it is a nice horizontal architecture, it leads to crazy power imbalances like we have today. Consider instead making it a distributed architecture, where the shards turn into domains, and each user on each domain has to auth with every other domain. But your network can then be distributed without a single point of failure. What's more, local area networks will be able to host your app and be quick without the signal bouncing halfway around the world.
How FriendFeed uses MySQL to store schema-less data
Edit to add the HN discussion at the time:
Like many people I investigated the NoSQL movement for potential applicability, and almost swallowed the hype. As I investigated more, I realised:
1. There are some specific instances where a NoSQL engine makes good sense. They're a valid option and should be considered depending on the application. In my experience though, well formed RDBMS structures are the better option in the vast majority of applications.
2. Most of the hype and growth came from people who (a) were using the abomination known as ORMs which are the canonical example of a round peg in a square hole; and/or (b) didn't know how to build performant RDBMS schemas. For these people, the NoSQL engine was fast because it was the first engine they actually learned how to optimise correctly.
Indeed, the "Vietnam of computer science."
This is a key/value store inside an RDBMS that just works, and it works great!
I converted a crappy sloppy super messy 1000+ column main table in a ~800GB database to use hstore, and it was, in real world benchmarks, between 7x and 10,000x (yes, really, ten thousand times) faster.
The CEO of the company who had a technical say in everything, and was very proud of his schema "wasn't excited" and it never happened in any production instance.
I've left since then, and the company has made very little advancement, especially when it comes to their database.
Really, just use hstore. Try it out. The syntax is goofy, but... I mean, SQL itself is a little bit goofy, right?
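For the curious, the goofiness is mostly the literal syntax and the operators. A minimal example (names made up):

    CREATE EXTENSION IF NOT EXISTS hstore;
    CREATE TABLE pages (id bigint PRIMARY KEY, attrs hstore);
    INSERT INTO pages VALUES (1, 'title => "Home", lang => "en"');
    SELECT attrs -> 'title' FROM pages WHERE attrs ? 'lang';  -- value lookup / key test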
The proper pronunciation of SQL is SQuirreL.
NoSQL databases are for BIG data. As in, billions of rows big.
Replicating the data to mongodb at the time, with a thin API for search and read queries... omg, it was a staggering difference - beyond just caching, it took all the search queries. Today, I'd be more inclined to use ElasticSearch (there was an issue with geo indexing at the time iirc)... just the same, it really depends on the shape of your data.
I feel that the storage shape should serve the needs of the application. SQL databases encourage normalization to such a degree that it's costly to construct a semi-complex object in memory, especially when your queries will run across many tables for many thousands of users. Joins kill performance at scale... If you can get around that, you're often better off.
Duplicating data to a system that is a better fit for mostly-read/query scenarios is a good idea. There's nothing that says you can't have your data in two places, and it's easy enough to setup services that copy on update.
Does that mean you've stopped using Mongo altogether?
When I cast my eye over a table with foreign key constraints, I am 100% certain that every single row conforms to those constraints, and always will.
By contrast, when the same table does not have constraints, but instead relies on some business logic layer to enforce them, then I have to consider whether there might be corrupt rows put in there by:
- bad business logic code
- bad import scripts
- some contractor who used to work for us 5 years ago and briefly uses his php script to push up some data
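For illustration, using tables shaped like the article's (the exact DDL is my guess): one declarative REFERENCES clause replaces all of that vigilance, because every writer, human or script, goes through the same check:

    CREATE TABLE sites (
        site_id uuid PRIMARY KEY
    );

    CREATE TABLE routes (
        route   text PRIMARY KEY,
        site_id uuid NOT NULL REFERENCES sites (site_id)
    );

    -- bad business logic, bad imports, and the contractor's php script all
    -- hit the same wall: a route pointing at a missing site is rejected
    INSERT INTO routes VALUES ('/home', '00000000-0000-0000-0000-000000000001');
    -- ERROR: insert or update on table "routes" violates foreign key constraint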
Scenario 1: what if a system needs to migrate to a different database system? Then the whole business logic needs to be totally re-implemented in the destination system's DSL.
Scenario 2: what if a system needs more than one storage system to persist business state? For example, I use the db to store image metadata and s3 to store the image. I don't believe the foreign key constraints will still work.
Scenario 3: what if the system needs to process business state asynchronously, for example with a message queue?
Also think about how to do unit tests (this is also how we keep the business logic correct) and how to do CI/CD. System design is more than just ERD design.
> Scenario 1: what if a system needs to migrate to a different database system? Then the whole business logic needs to be totally re-implemented in the destination system's DSL.
If it takes you more than a few minutes to rewrite a foreign key constraint to work on a different target database then you're doing something wrong.
> Scenario 2: what if a system needs more than one storage system to persist business state? For example, I use the db to store image metadata and s3 to store the image. I don't believe the foreign key constraints will still work.
You're right, it doesn't - you can't use a foreign key constraint there. That doesn't mean you abandon them altogether, just use them where they work - in your database.
> Scenario 3: what if the system needs to process business state asynchronously, for example with a message queue?
Then use a message queue. Still no reason to give up on the declarative integrity checking of FK constraints inside your database.
No idea what you're talking about wrt unit tests - perhaps you're saying you want to allow bad data into your database so that you can run your tests? In which case here's a better idea - use FK constraints to never let it in in the first place, and don't bother testing, since the DB vendor already did it for you.
Any info how "active-active-active" (I assume 3 aws regions) is accomplished?
But there is the awkward replication model, the lack of native data structures as column types, and the lack of sharding support.
This is basically them failing to do enough research into existing solutions that would work far better.
On the other hand, many NoSQL databases like MongoDB and RethinkDB have a query language which was designed to run on both single-machines and distributed infrastructure (in a homogeneous way); the same queries which work on a single machine will also work at scale on multiple machines - No need to rewrite your queries as your app grows.
You CAN scale with SQL but you have to know what queries to avoid (e.g. table joins, nested queries...), whereas with NoSQL you don't have to avoid any queries; if it's in the docs, it's safe to use.
Finally, a major difference between SQL vs NoSQL is the typed vs untyped structure. Most SQL databases were designed in a time when statically typed languages were mainstream; so it made sense for SQL databases to enforce static typing on their data.
Generally, NoSQL DBs impose constraints to query expressiveness in order to free you from architectural constraints.
SQL DBs impose few constraints on query expressiveness but because of this, they add architectural constraints to your system.
To pick a quick one: "query language which was designed to run on both single-machines and distributed infrastructure". Mongo has no fewer than THREE query syntaxes: standard, map-reduce, and the aggregate pipeline - which even Mongo employees recommend people avoid like the plague: https://www.linkedin.com/pulse/mongodb-frankenstein-monster-...
HA HAHA HAHA
Care to elaborate more on this? What do you mean by live migrations?
As for your question, NoSQL datastores can be grouped into multiple categories:
- column stores (like hadoop, cassandra, informix), which optimize for sharded and distributed storage of related data elements
- document stores (like elasticsearch), which focus on metadata organization for large opaque (binary) objects
- key-value stores (like redis, openldap), which are basically unstructured, associative arrays (hash maps). They allow the most storage freedom, and are hardest to optimize.
- graph databases (like neo4j, trinity), where more information is carried in annotated inter-object links than in the objects themselves.