I'm impressed, I'd say a lot of projects couldn't have handled this so nicely
They are not the prime example I would use for DOTADIW.
Puts me in mind of something I've seen done in Redis, and in the OSX (really NeXT) CoreFoundation container classes. In both, you instantiate an abstract data-structure based on worst-case time-space guarantees you want from its API. Then, the implementation actually instantiates it as one of several underlying concrete classes, usually starting with one that wins for low content sizes just because of fewer dereferences needed (e.g. a "ziplist" in Redis); and then migrates the content to a different underlying concrete container transparently when content reaches a certain size.
In both cases, you're giving the user a cleaner abstraction that they can think less about, but you're accomplishing it by making the computer itself do far more internal book-keeping, and creating a much more heavily-engineered design.
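The "transparent migration" idea described above can be sketched in a few lines: a set-like container that starts life as a plain sorted list (cheap at small sizes, fewer dereferences) and silently upgrades itself to a hash set past a size threshold, loosely how Redis promotes a ziplist to a real hash table. The threshold value and class names here are illustrative, not taken from any real implementation.

```python
import bisect

class AdaptiveSet:
    """Set-like container that migrates its underlying representation."""
    MIGRATION_THRESHOLD = 8  # illustrative cut-over point

    def __init__(self):
        self._small = []      # sorted list while the set is small
        self._large = None    # becomes a real set() after migration

    def add(self, item):
        if self._large is not None:
            self._large.add(item)
            return
        i = bisect.bisect_left(self._small, item)
        if i == len(self._small) or self._small[i] != item:
            self._small.insert(i, item)
        if len(self._small) > self.MIGRATION_THRESHOLD:
            # migrate transparently: the caller never sees the switch
            self._large = set(self._small)
            self._small = None

    def __contains__(self, item):
        if self._large is not None:
            return item in self._large
        i = bisect.bisect_left(self._small, item)
        return i < len(self._small) and self._small[i] == item

    def __len__(self):
        return len(self._large if self._large is not None else self._small)
```

The caller's API never changes; all the extra book-keeping (and the migration cost itself) is paid inside the container, which is exactly the "heavier engineering for a cleaner abstraction" trade described above.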
Color me impressed as well.
Except the name is a bit clunky and hard to write ;)
That's what they said in 1996. Then they said it again in 2006. In 2026, when it's interwoven into all sorts of AI systems and still causing confusion, they will look back and wonder why they didn't fix it back in 2017 when they still could. "Now" is always the right time; it's never too late.
are both very good, short, and difficult to confuse with another project or product
I think psql refers specifically to the command-line tool of PostgreSQL.
I'll take PostgreSQL's name (which I consider clever) any day over some name completely devoid of originality or thoughtfulness, such as most of the names Microsoft uses for its software.
Let's see how many iterations we'll need for people to actually notice what was pointed out here ;)
"He lives in Helsinki with his second wife Anna and daughter Maria (after whom MariaDB was named), and has a daughter My (after whom MySQL was named) and a son Max (giving the name for MaxDB) from his first marriage"
Besides, Widenius is Swedish-speaking (just like Linus Torvalds and about 10% of the people of Finland), and the name "My" was more or less invented by the also Swedish-speaking Finnish author Tove Jansson in her books about the Moomins.
Although Wikipedia claims it to be a short form of Mary or Maria, which I seriously doubt.
Anyway, Finnish doesn't have much to do with the name. I think it allegedly was Tove Jansson's uncle, a professor of mathematics, who suggested the name based on the mathematical symbol and Greek letter μ as pronounced in Swedish. It might even be vaguely similar to how it was pronounced in ancient Greek :-)
Besides, I think that "My" in Finnish would be spelled "Myy", since just about the only simple thing about the Finnish language is that vowel length is indicated by the number of characters.
Maybe Monty should upload an .au file somewhere saying "Hello, this is Monty Widenius, and I pronounce MySQL as MySQL"!
Three of them being:
Edit: did not see the typo at first though the word seemed longer than usual...
More than that, within PostgreSQL I get the sense that while there are serious discussions, all parties try to understand alternative positions and resolve conflicts instead of letting them brew.
Nice and clean post showing that acknowledging a weakness isn't a terrible choice.
I do wonder whether a different data structure would have mitigated the issue instead of transitioning to a different storage engine.
I would have tried solving it by loading a dual E5 v4 server full of RAM (3TB) and a slew of SSDs for L2ARC+ZIL under ZFS. More SSDs beat bigger SSDs, because the absolute worst-case performance that any and all SSDs suffer from is random reads (not writes): they can slow to 200MB/sec even on those insanely fast "enterprise" PCI-E SSDs that do 2-4GB/sec of sequential reads.
Given that, there are only four DBs worth using: Postgres, Oracle, DB2, and MS SQL. Unless you're doing something truly not suited for SQL (or for which SQL is insane overkill), or you're Google with a database in the hundreds or thousands of TB and literally have to write your own (they went from almost inventing MapReduce to coming full circle back to a full-scale distributed SQL RDBMS with F1, which I wish they'd open-source), any other database is just going to be a pain in the ass: buggy, full of gotchas and undiscovered corner cases, simply because it doesn't have over a decade of development and millions of users behind it.
Also, I have not included MySQL for obvious reasons: fun for toy SQL DBs where SQLite isn't a good fit, but not for enterprise use by any means. Uber switching to MySQL from Postgres is rather scary; I wouldn't want to be an Uber investor right now.
Yes, I'm aware there are systems for MySQL to handle failure, but I'm also aware of the systems for Postgres, and Postgres's failure handling seems to be far saner and easier to recover from. Defense in depth against failure is easier in Postgres, in my experience.
This would be like Elon Musk blogging about how "oh yeah, sometimes the brand new Tesla factory shuts down completely because sometimes the power goes out; but we're using really popular well known power distribution systems, so we're industry compliant, so everything is okay."
If Elon Musk blogged that, HN would go apeshit, and rightfully so.
PostgreSQL is trustworthy, predictable, engineered, and engineerable. It's like a German unionized automobile-plant press operator sitting down on the job and crossing his arms until the broken safety switch is fixed, which will take precisely 3.25 hours and cost $X while the resulting assembly-line shutdown costs 1000 × $X. But it'll be safe and nobody's going to lose an arm. Your downtime and related costs are more or less predictable. Maybe not the highest-productivity plant in the division, but a nice safety record.
MySQL is best effort. The safety switch on the press breaks, redlining that machine and shutting down the entire plant. Hmm, if I stick my arm in that 50-ton press while it's operating, that'll hurt a bit, so let's just not do that. Dude's a real tryhard, which always ends like you'd expect. Of course, safety regulations were literally written in blood, so at some unpredictable time in the future you'll get a $1M personal-injury lawsuit for the loss of an arm and a $10M OSHA fine, and the plant will be shut down for the criminal investigation for a random, indeterminate amount of time, plus the interval required to remove the arm from the press. Your downtime and costs are completely unpredictable, but probably, over a very long term and averaged across many people, mostly lower than PostgreSQL's. The plant will have a higher productivity metric, and also a worse safety record.
However, there is an important point: philosophy doesn't matter when times are good. It's only when the tool is misused or there's a malfunction that the underlying philosophy even shows up.
It doesn't matter which system you use when you try to store Aug 1 2016 into a date column, but (at least in the old days) it was very interesting trying to store February 30th into the databases. Insert anyway with a warning? Round up, round down, or stick in a NULL? Normalize it to March 2nd-ish? Fail the insert completely with an error? This has varied with time and configuration, but in a "general sweep of history" manner you can guess correctly most of the time what each DB does.
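The policy choices above can be sketched concretely. A strict engine (the Postgres-style posture) rejects February 30th outright; a lenient one might normalize it by rolling the overflow forward into March. This is a minimal illustration using Python's `datetime`; the function names are made up for the sketch and are not any database's actual behavior.

```python
from datetime import date, timedelta

def store_date_strict(year, month, day):
    # Strict policy: an impossible date is an error and the insert fails.
    # datetime.date raises ValueError for a day that doesn't exist.
    return date(year, month, day)

def store_date_lenient(year, month, day):
    # One possible "normalize" policy: count forward from the 1st of the
    # month, so 2016-02-30 rolls over into early March.
    return date(year, month, 1) + timedelta(days=day - 1)
```

For a valid date both policies agree; they only diverge on garbage input, which is exactly where the underlying philosophy "shows up."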
Also there's nothing wrong in any way with a critical system that drops into philosophical best effort mode during a crisis rather than paralytic halt mode. Well, there's nothing wrong with it as long as the system was engineered with that in mind and neither the dev nor ops people are surprised by that behavior. Sometimes that is the right thing to do.
https://surge.omniti.com/2015/images/presentations/MattRanne... (Slides 39-63)
I'd like to have a source on that; it would help shut down a lot of MySQL discussions if true.
A typical takedown would be the likes of: http://grimoire.ca/mysql/choose-something-else (which also touches storage engine configuration things that are easier to defend against by an experienced organization). Unfortunately this sort of takedown solves very few discussions.
There are also tons of hidden gotchas that exist in, for example, the query planner. It can be extremely fickle and suddenly switch from a performant query plan to a terrible one that creates unindexed temporary tables and sorts them or joins against them. Or it just ignores relevant indexes on the tables entirely.
Everything hums along fine until a random INSERT or UPDATE causes the query plan to change, bringing down your entire site. To be fair, such a problem can happen in any DBMS but I've never experienced it with Postgres to the extent that I have with MySQL.
Anyone with a huge production database running under load is going to have ways of mitigating these problems. Tumblr manages with MySQL, they open-sourced some of their tools like JetPants (https://github.com/tumblr/jetpants) to help build huge datasets.
So maybe Uber made a call and said "we can deal with intermittent corruption problems, we can recover from those, so long as the performance is better because a reputation for being slow is something we can't recover from". Life is all about trade-offs.
Regardless, MySQL by default silently eats data in common situations (truncation of VARCHAR) and returns flat-out incorrect results due to PHP-style "helpful" coercions (SELECT 0 = "banana" evaluates to true). It implements UTF-8 incorrectly, but fixing it would break existing apps, so we're forever stuck with a "utf8" encoding that isn't.
There are a million more of these, and while some of them have workarounds (strict tables, utf8mb4), many of them don't (automatic coercion, boneheaded query planner, creating implicit temporary tables without indexes even when present, etc.).
A comparison of MySQL to PHP is apt, honestly. The fact that PHP is a blight doesn't mean other languages don't have their own problems. But PHP (like MySQL) is in a league of its own here.
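The coercion complained about above comes from how MySQL compares a number against a string: the string is converted by parsing its leading numeric prefix, and a string with no numeric prefix becomes 0, so `0 = "banana"` is true. A minimal sketch of that conversion rule (simplified, not MySQL's full conversion logic; function names are made up):

```python
import re

# Matches an optional sign, digits/decimal, optional exponent at the
# start of the string, i.e. the "leading numeric prefix" MySQL parses.
_NUM_PREFIX = re.compile(r'\s*[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?')

def mysql_numeric_coercion(s):
    # A string used in numeric context contributes only its leading
    # numeric prefix; no prefix at all means it coerces to 0.
    m = _NUM_PREFIX.match(s)
    return float(m.group(0)) if m else 0.0

def loose_equals(number, string):
    # MySQL-style comparison of a number against a string literal.
    return float(number) == mysql_numeric_coercion(string)
```

This is the same failure mode as PHP's old loose `==`: the comparison silently changes the type of one operand instead of raising an error.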
The "by design" part refers to early versions of MySQL: the discussions around it show it purposely did not care about ACID. Speed was the number one driver.
We are now investigating switching to MariaDB instead. (I'd personally love to move to Postgres, but that's not likely to happen any time soon)
This, in addition to the fact that index merging has been broken in MySQL 5.6 for more than a year now (in some cases it will cause empty result sets to be returned), and that it is still broken in MySQL 5.7.
"The default SQL mode in MySQL 5.7 includes these modes: ONLY_FULL_GROUP_BY, STRICT_TRANS_TABLES, NO_ZERO_IN_DATE, NO_ZERO_DATE, ERROR_FOR_DIVISION_BY_ZERO, NO_AUTO_CREATE_USER, and NO_ENGINE_SUBSTITUTION."
I've been running a few MySQL setups for a decade, why do you think it is not a good fit for enterprise apps? InnoDB solves most of the previous limitations. I also think MySQL is slightly easier to deploy/scale than Postgres. I believe Facebook is still running MySQL for instance.
You may be interested in CockroachDB, which was inspired by Google's F1 and Spanner implementations.
Another contender was FoundationDB, but Apple bought them and stopped selling it as a product.
No, it was not. FoundationDB's SQL layer was a joke. The system was horribly slow. Every DBA that I talked to that tried it said that it did not deserve the hype that it got.
I guess you wouldn't have wanted to be a GOOG investor either? Most of their early revenue was from AdWords on MySQL.
Google is maybe the only company I trust to know what they're doing when it comes to databases.
You can armchair engineer all you like, but at the end of the day, the proof of the engineering is in the working.
And if you have Uber stock you'd like to sell just because you think they're making a questionable engineering decision around databases (despite all the real world evidence that it is adequate to serve large scale businesses), I know a few investors who would gladly take it off your hands.
Whether you /can/ coerce it into doing the right thing is not the issue. Uber are doing their own thing regardless of whether it's right, so they're unlikely to change in that regard; what makes you sure they will do MySQL right at all?
I could be wrong but most databases are pretty competent with this style of workload.
It's true that the large majority of FB's mysql fleet is dedicated to the main product data, essentially an object-graph store with a restricted access pattern. The queries are indeed relatively simple. I can't remember for certain but I think that simple joins and transactions are actually used there.
However, the rest of the fleet supports an extremely diverse set of workloads -- keep in mind that MySQL is the primary data store of Facebook, and this includes product, ad serving, payments, async task persistence, internal tooling, many many other things. Countless different query patterns are in use. And although this is a minority of FB's mysql fleet, it's still many many thousands of machines supporting these other workloads, substantially larger than the vast majority of Postgres installations in the world.
(Source: I worked on MySQL automation at FB, and was lead dev on their RDS-like DBaaS, used by a large portion of the company's engineering teams in one way or another.)
I will readily admit that Postgres is a very good database, and definitely a better choice for OLAP workloads than MySQL (which has problems with complex queries) / InnoDB (problems with long-running read queries impacting old-row purge behavior). However, for extremely high-volume, large-scale OLTP workloads, MySQL/InnoDB is absolutely an excellent choice, imo better than Postgres for reasons of performance, ecosystem, and the number of experienced engineers who have worked at extreme scale.
We jumped onto the 2nd gen instances even when they were in beta.
I always ask: is there a need for absolute correctness, and would this be better handled by a batch-processing method?
I'm not sure any hardware would have solved their issue though. It may not have actually been disk speed, but issues in the mvcc design itself.
This is total bullshit. Even a modest size database (3-5 terabyte) for a small company like mine, chokes a relational database bigtime on time series. Especially the modern use cases where you are ingesting fast. You need Cassandra or Hbase which are very serious pieces of technology and are definitely "worth using", and will crush any of the four relational dinosaurs you mention in that space at comparable cost, and with embarrassingly better scalability. And by the way, that space is not some corner case, it is probably the major growth area of the next computing era.
You don't need to give up proven relational databases for this web-scale bullshit. Financial companies have been doing hundreds of times “modern use cases where you are ingesting fast” for decades. KDB+ is the darling of that industry, but DB2 is pretty popular (IIRC) as well as some other niche options.
Right tool for the right job, basically. I have nothing against Cassandra or HBase, and they're both quite impressive pieces of engineering. But they're not an alternative to relational databases; rather, a complement for very niche use-cases. I see NoSQL stuff get used for “performance and scalability” in places where a columnar relational DB would scale just fine and bring the benefits of a relational database. I'm all for HBase for non-relational use-cases, but those are rather more rare than people seem to think (and end up with an ad hoc relational model because of it).
There's also Apache HAWQ (incubating), which takes Greenplum's SQL parser and distributed query planner to use as front-ends for Hadoop.
Disclaimer: I work for Pivotal, which opensourced these systems.
Otherwise, this would have been a good comment, I think.
(I'm most familiar with SQL Server; DB2 and Oracle apparently have similar functionality.)
Apparently their 3-5TB time series table chokes ‘dinosaur’ relational databases. For a time series table you should almost certainly be using a column-store index with CREATE CLUSTERED COLUMNSTORE INDEX. Depending on how much they're querying vs inserting, that alone could prevent SQL Server from choking. If it's more insertion-heavy, but they still need to run a lot of ad-hoc queries, SQL Server 2016 supports using a nonclustered column-store index alongside a row-store. You can insert with the row-store and query with the column-store. If it's still choking with that, it might be time to check out the database structure and see what's up. One of the more common killers is putting a lot of per-row metadata in the time series table itself. Generally it's more efficient to have a clustered column-store index table that contains the timestamp and ids pointing to metadata, and keep the metadata itself in row-store tables with appropriate indexes.
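The reason a column-store fits time-series aggregation so well is that a query like AVG(value) only needs to touch one column, not every field of every row. A toy sketch of the two layouts (purely illustrative; nothing like SQL Server's actual columnstore internals, and the sensor schema is made up):

```python
# Row-store: one record object per reading; pulling out a single field
# still means walking every whole record.
rows = [
    {"ts": 1, "sensor_id": 7, "value": 20.0},
    {"ts": 2, "sensor_id": 7, "value": 22.0},
    {"ts": 3, "sensor_id": 8, "value": 19.0},
]

# Column-store: one contiguous array per column.
columns = {
    "ts": [1, 2, 3],
    "sensor_id": [7, 7, 8],
    "value": [20.0, 22.0, 19.0],
}

def avg_value_rowstore(rows):
    # touches every record in full to extract one field
    return sum(r["value"] for r in rows) / len(rows)

def avg_value_columnstore(columns):
    # touches only the "value" array; "ts" and "sensor_id" are never read
    col = columns["value"]
    return sum(col) / len(col)
```

At 3-5TB, reading one dense column instead of whole rows (plus the compression that homogeneous columns enable) is the difference between a scan that fits in cache bandwidth and one that doesn't.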
Separately it is absolutely correct that a comment that asserts that 4 relational databases are the only serious ones is completely misguided and deserves a robust rebuke.
Technology moves fast. Very fast. If a piece of tech gets old enough to be called a dinosaur, and is still run at brand-new companies, it's doing something right, even if it's not for your usecase.
And the fact is, not all of us have 3-5TB time-series databases. And time series data can cause a lot of DBs to choke.
The point is, I'm glad Cassandra and HBase, or a dedicated TSDB, or whatever, work for you. Your use case is not the same as the rest of ours, and assuming it is has led you to some incorrect conclusions.
lol right. With MongoDB, MapReduce is a joke, GridFS is slow and barely usable, the storage is extremely inefficient, the "query engine" is slow, and don't get me started on their "full text search" engine. MongoDB is a successful marketing stunt of the "Node.js era".
This has also been stated on their download page for 32-bit binaries.
Note: I hate Oracle DB, but I must work with Oracle 9/10/11/2 DBs because that's what our clients have.
There was also the issue where errors were either logged or not, based on the result of Math.random(): http://stackoverflow.com/q/16833100/1233508
Let's get some statistics. MySQL and Postgres both have an undefined max DB size, and max table sizes upwards of 16 and 32 TB, respectively.
SQLite has no max table size I could find, but has a max DB size of 140 TB. 140TB! And this is from a database that recommends not using it if your data grows too large.
Reading between the lines: "let's ignore the obviously suboptimal choice of architecture and concentrate on the DB specific issues addressed".
OTOH, what would be a nice way of doing CQRS completely within postgres?
This is a key take-away for me. I used MySQL extensively and switched to Postgres for everything years back. I would need extremely good reasons to use MySQL again.
What you get with postgres is a lot more consistency and peace-of-mind, in my opinion.
I think we will at some point. That's the primary original use case for a lot of NoSQL, and the reason Twitter had so much trouble with relational databases.
But these are cultural understandings, and those move slowly. Also, we're poorly (collectively) equipped to handle subtlety in these discussions, so mostly we're trying to move from "relational databases are perfect for all use cases" to "NoSQL databases are perfect for all use cases" -- which is even less true, not more true.
Culturally, this is a hard thing to keep in our collective brain.
But at scale, weird partial failures, results as they are found and eventual consistency are usually preferable to a really long DB query that never returns.
Above a certain scale, big joins are simply not usable. NoSQL's manage-it-yourself approach is good for getting partial results where full results are too expensive.
You can think of it as applying a heuristic approach where an exact approach is too expensive, if it makes you feel less like NoSQL sullies the purity of databases :-)
Yes, that's SQL, but the term nosql was always orthogonal to the priorities of the movement (horizontal scalability).
Their exact use-case is what things like Cassandra were built for - insanely high writes / updates. They're also built to split your load across N systems, as long as you're still using a monolithic database (even with read replicas) you physically can't get the same performance that you could with one of the NOSQL distributed systems. YMMV for specific performance, I've seen a huge MySQL burn through queries like butter and a Cassandra cluster crawl on a tiny data set.
The second part of your question is the real crux of the problem - "heavily indexed and used heavily in joins". Neither of those (RDBMS or NOSQL) work well in either scenario. This article indicates why PG isn't great, the Uber article indicates some of the (minor) downsides of MySQL. They're already using Schemaless to bend their RDBMS into a fancy key-value store, so they're halfway to using a real one with their home-built indexes already. For NOSQL you generally don't get joins, unless you write them yourself. You also end up with manual "write amplification" since you denormalize, write the data 10 times to index it 10 different ways. You can be smart about it so it's not exactly 10x, but you'll end up with more than your original problematic throughput, albeit spread across more systems.
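The "manual write amplification" described above can be made concrete: with no joins, each logical write is fanned out to one denormalized copy per query pattern (per "index"). The view names and the trip record here are illustrative, not Uber's actual schema:

```python
# Each dict stands in for one denormalized table/materialized view,
# keyed by the field that query pattern looks up.
views = {
    "trips_by_rider": {},    # rider_id -> list of trip records
    "trips_by_driver": {},   # driver_id -> list of trip records
    "trips_by_city": {},     # city -> list of trip records
}

def write_trip(trip):
    # One logical insert becomes len(views) physical writes: this is the
    # write amplification you pay for join-free, pattern-specific reads.
    views["trips_by_rider"].setdefault(trip["rider_id"], []).append(trip)
    views["trips_by_driver"].setdefault(trip["driver_id"], []).append(trip)
    views["trips_by_city"].setdefault(trip["city"], []).append(trip)
    return len(views)  # amplification factor for this design
```

Reads become single key lookups with no joins, but every new query pattern adds another copy of every write, which is why the total write throughput ends up higher than the original problem, just spread across more machines.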
This would almost certainly be true if the design were widespread (which it's not as far as I know), but it isn't necessarily true for all cases.
I think it would be better framed as an optimization problem. If you design for a domain that actually models 500 events per second in a dataset of 50K items, the simplest correct implementation will implement exactly that. If that domain also involves reads which benefit from joins and indices that make those writes prohibitively slow, you have a conflicting set of optimization paths. The fact that some tools don't accommodate that well is an implementation detail, and addressing that fact is optimization, not necessarily a primary design consideration.
Provides a link to the rationale post from uber (https://eng.uber.com/mysql-migration/), and a tl;dr of it. Prior discussion here: https://news.ycombinator.com/item?id=12166585
Also, I'd like to take this moment to address those people who will start waving this "Uber switched to MySQL from PostgreSQL" argument around without considering that not every app is "Uber". You can't simply take their use case and start throwing stuff against PostgreSQL.
imho some of the glaring shortcomings of the technology you refer to can be traced back to the very issue that you are highlighting.
In every single performance-tuning-at-scale story that I've read over the past decade, the very first point of order is: remove joins from high-traffic queries. It seems like Uber has gone in the complete opposite direction.
Joins start getting particularly tricky when you do expensive stuff like filter over the correlated results of many tables. But in those cases, there's also no easy alternative: you'll need to redesign the way you store your data, if possible.
And it's a hard problem. Twitter spawned a huge wave of NoSQL to deal with this, and it's still not really dealt with.
Sessions are vital for any product that works better with logged-in users.
I also wonder what has happened over the last few (10) years. When I was in university, I'm pretty sure I learned that JOIN was Satan's mother and that if you have a big DB you need to avoid JOINs as much as possible. That doesn't seem to be a big deal today anymore.
Unless you already have your entire database in-memory in your app, that is. In that case, why do you have a database?
A more realistic example is, do you get Alice's pets by doing a JOIN on tables Person, Pet, PetOwnedByPerson ("SQL") -- or by having an array column "pets" in Person? ("NoSQL")
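The two designs in that example can be shown side by side with the stdlib `sqlite3` module. The schema and names (Person, Pet, PetOwnedByPerson) follow the comment's hypothetical, and since SQLite has no array column, a plain Python dict with an embedded list stands in for the "NoSQL" document:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE Person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Pet (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE PetOwnedByPerson (person_id INTEGER, pet_id INTEGER);
    INSERT INTO Person VALUES (1, 'Alice');
    INSERT INTO Pet VALUES (10, 'Rex'), (11, 'Whiskers');
    INSERT INTO PetOwnedByPerson VALUES (1, 10), (1, 11);
""")

# "SQL" style: normalized tables, answer the question with a join
pets_via_join = [row[0] for row in db.execute("""
    SELECT Pet.name FROM Person
    JOIN PetOwnedByPerson ON PetOwnedByPerson.person_id = Person.id
    JOIN Pet ON Pet.id = PetOwnedByPerson.pet_id
    WHERE Person.name = 'Alice'
    ORDER BY Pet.name
""")]

# "NoSQL" style: denormalized document, pets embedded in the person record
person_doc = {"id": 1, "name": "Alice", "pets": ["Rex", "Whiskers"]}
pets_via_embed = person_doc["pets"]
```

Both answer "Alice's pets"; the difference is who pays: the join pays at read time and keeps one copy of each pet, while the embedded array pays at write time (every change to a shared pet must touch every owning document).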
"Normalize until it hurts, denormalize until it works."
There's actually two competing philosophies on data warehousing (Inmon and Kimball), but I've only ever used Kimball's method, which favors denormalization.
But yeah, usually, don't be clever and don't spaff the contents of the database across a network just to do a join.
I would generalize what you say even further: The database is faster at most of the data crunching you need.
Interestingly, they were also called UBER TECHNOLOGIES LTD and UBERTECHNOLOGY LIMITED at some point in time, but these companies are now dissolved.
The real company in Luxembourg.
It's like saying Google is just an ad company that "happens to use tech".
Whereas Uber is all about their client app, GPS tracking, etc. Without that they wouldn't be Uber but a large taxi company.
Without Uber technology, Uber would not exist - it would just be Uber Taxi Inc., a perfectly ordinary taxi company indistinguishable from all others.
Whether it'd largely avoid the ire of law enforcement if they couldn't hide behind their tech is a different matter.
The main difference between Uber and taxi companies is that Uber fully utilized modern technology from the ground up to build the platform, and as a result they were able to realize major improvements in terms of speed, reliability, and cost-effectiveness.
In the EU, they're considered what they are: a nice gimmick to shift company costs onto workers and to tax workers' salaries at 20% while keeping the company's own costs fixed: https://www.jacobinmag.com/2016/07/uber-drivers-app-rideshar...
it makes me wonder, though: if I had a table with 50k rows, updated hundreds of times per second and used in joins throughout the database, is there any way I can just stick that whole table into memcached or redis? I know there are some cases where this works, some where it doesn't. curious if this option was explored.
A relational db is originally meant for fetching data from a slow storage to a fast one, and the other way around, in a smart way. Doing it 500 times a second isn't a scenario for any db.
Build your own in-memory data and process structure, maybe using something like an agent network, and using e.g. Akka or Erlang for failovers.
I'm curious as to why uber had to rely on a relational db for this case...
As you say though, put it in memory first and write it out to a DB every now and then.
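The "keep it in memory, write it out every now and then" approach above is a write-back cache: it absorbs the hundreds of updates per second in RAM and periodically persists only the dirty rows. A minimal sketch; the flush trigger, names, and the dict standing in for the real database are all illustrative:

```python
class WriteBackCache:
    def __init__(self, backing_store, flush_every=1000):
        self.backing_store = backing_store   # stands in for the real DB
        self.flush_every = flush_every       # flush after this many writes
        self.cache = dict(backing_store)     # hot copy served from memory
        self.dirty = set()
        self.writes_since_flush = 0

    def update(self, key, value):
        self.cache[key] = value              # fast in-memory write
        self.dirty.add(key)
        self.writes_since_flush += 1
        if self.writes_since_flush >= self.flush_every:
            self.flush()

    def get(self, key):
        return self.cache[key]               # reads never touch the DB

    def flush(self):
        for key in self.dirty:               # persist only changed rows
            self.backing_store[key] = self.cache[key]
        self.dirty.clear()
        self.writes_since_flush = 0
```

The trade-off is explicit: a crash loses whatever accumulated since the last flush, so this only fits data (like live driver positions) where a few seconds of loss is acceptable.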
 Source: https://momjian.us/main/writings/pgsql/hw_performance/
this seems like an annoyance, not a time-for-a-new-database kind of problem.
PRIMARY KEY is essentially syntactic sugar for NOT NULL plus a unique index.
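That equivalence can be checked directly with the stdlib `sqlite3` module: a column declared PRIMARY KEY and a column declared NOT NULL with a separate UNIQUE index reject exactly the same bad inserts. (One SQLite quirk: for historical reasons SQLite allows NULL in non-INTEGER PRIMARY KEY columns, so NOT NULL is spelled out explicitly here.)

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Table a: constraint expressed as a primary key
db.execute("CREATE TABLE a (id TEXT PRIMARY KEY NOT NULL, v TEXT)")
# Table b: the same constraint "desugared" into NOT NULL + unique index
db.execute("CREATE TABLE b (id TEXT NOT NULL, v TEXT)")
db.execute("CREATE UNIQUE INDEX b_id ON b (id)")

db.execute("INSERT INTO a VALUES ('k1', 'x')")
db.execute("INSERT INTO b VALUES ('k1', 'x')")

def rejects(sql):
    """True if the insert violates a constraint."""
    try:
        db.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True
```

Duplicate keys and NULL keys fail identically on both tables, which is the whole claim: the primary key adds naming and intent, not new machinery.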
Additionally, I can't quite recall whether this is the case in MS SQL (since it uses pessimistic locking by default, not snapshot isolation, which has different performance characteristics), but in most MVCC architectures there's the additional problem that the underlying row value could have been changed concurrently with your query (e.g., deleted or modified). While this might not seem so bad for simple row-level lookups, it gets much more problematic if you are doing something like a range query, where the index might contain rows that weren't in the database at the beginning of your snapshot. There are a variety of ways of dealing with that problem, but most involve one or more of: increased write traffic (index sizes get even more bloated because now they need versioning information); increased read traffic (index reads that touch out-of-date rows may have to follow an undo pointer, requiring the extra seeks you were trying to avoid in the first place); more locking (for instance, you could lock the entire index whenever a transaction modifies it, but that would decrease concurrency, or you could try range locking, which increases the probability of deadlocks, also decreases concurrency, and increases contention on the lock manager); opportunistic optimizations that only work on mostly-immutable data (Postgres's and HyPer's solution is to maintain a much smaller visibility map with bits indicating whether it's safe to assume the index is unchanged); or giving up multi-key read consistency (I'm assuming that if this were an option you would be using another storage engine, because lots of them perform unbelievably well when that restriction is relaxed!).
My point being, there's no such thing as a free lunch. Personally, I'm usually extremely happy to give up on pessimistic locking for the concurrency benefits of MVCC, and index utility decreases sharply as it gets larger, but as with many other things it entirely depends on your workload. Two-phase locking actually works far better than MVCC of any sort under heavy contention with short transactions (what Uber is apparently doing), so they really probably should have investigated SQL Server or another database optimized for pessimistic concurrency control.
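The versioning-plus-undo-pointer machinery discussed above reduces to a simple idea: each row keeps a chain of versions stamped with the transaction id that wrote them, and a reader walks back from the newest version until it finds one visible at its snapshot. A minimal sketch, entirely illustrative of the concept rather than any engine's actual on-disk format:

```python
class VersionedRow:
    """One logical row holding its MVCC version chain."""

    def __init__(self):
        self.versions = []  # list of (txid, value), newest last

    def write(self, txid, value):
        # A write never overwrites: it appends a new version, which is
        # exactly the extra write traffic / bloat discussed above.
        self.versions.append((txid, value))

    def read(self, snapshot_txid):
        # Walk from newest to oldest (following "undo pointers") until a
        # version visible at this snapshot is found -- the extra read
        # traffic discussed above.
        for txid, value in reversed(self.versions):
            if txid <= snapshot_txid:
                return value
        return None  # row did not exist at this snapshot
```

Writers and readers never block each other here, but every stale version a reader has to step over is an extra dereference, which is why long version chains (long-running transactions, hot rows) hurt MVCC reads.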