
The Architecture of Schemaless, Uber Engineering’s Trip Datastore Using MySQL - danielbryantuk
https://eng.uber.com/schemaless-part-two/
======
danbruc
Color me skeptical: that looks like a pretty strange design to me, a database
on top of a database [1].

[1] [http://c2.com/cgi/wiki?GodTable](http://c2.com/cgi/wiki?GodTable)

~~~
jkovacs
Agreed, the design immediately reminded me of the following ancient DailyWTF
article, one of the few that has stuck in my head ever since I read it. They
called it the "Inner-Platform Effect" - might apply here.

[http://thedailywtf.com/articles/The_Inner-Platform_Effect](http://thedailywtf.com/articles/The_Inner-Platform_Effect)

------
ivan_ah
I've been following these Uber engineering articles, and I think this is a
very neat architecture. Append only + boring technology = solid stuff.

I'm curious to know how many shards per storage cluster they use and how this
mapping is done. Is it fixed or can it change? I imagine a startup trying to
use a similar setup could start with a few storage clusters, but then add more
clusters as needs grow...

They say they use 4096 shards (presumably generated based on some part of
`row_key`, which is the trip id), but I'm not sure this is a generally
applicable strategy. For example, if a social network website shards on
`user_id`, then you won't be able to do joins across `user_id`s.
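
To make the question above concrete, here is a minimal sketch of one common way to do this kind of mapping. It is not taken from the article: the hash function, the `shard_for` / `build_shard_map` names, and the cluster names are all assumptions for illustration. The idea is that the shard count stays fixed at 4096 while the shard-to-cluster table is just data that can be rebuilt as clusters are added.

```python
import hashlib

NUM_SHARDS = 4096  # shard count quoted above; everything else here is assumed


def shard_for(row_key: str) -> int:
    """Map a row key to a shard number in [0, NUM_SHARDS).

    Assumption: a stable hash of the whole key. The article does not say
    which part of row_key is used or which hash.
    """
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


def build_shard_map(clusters: list[str]) -> dict[int, str]:
    """Assign contiguous ranges of shards to the given clusters (hypothetical layout)."""
    return {
        shard: clusters[shard * len(clusters) // NUM_SHARDS]
        for shard in range(NUM_SHARDS)
    }


def cluster_for(row_key: str, shard_map: dict[int, str]) -> str:
    """Look up which storage cluster holds the shard for this row key."""
    return shard_map[shard_for(row_key)]


# Start small, then grow: the key -> shard step never changes,
# only the shard -> cluster table does.
small = build_shard_map(["cluster-a", "cluster-b"])
large = build_shard_map(["cluster-a", "cluster-b", "cluster-c", "cluster-d"])
key = "trip-123"  # hypothetical row key
print(shard_for(key), cluster_for(key, small), cluster_for(key, large))
```

With a layout like this, adding clusters means moving whole shards and updating the table; no row ever needs to be re-hashed. It does nothing for the cross-shard join problem, which still has to be handled in the application.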

------
pbreit
Anyone want to weigh in on whether or not Postgres is a viable option for
this?

~~~
jrcii
Only semi-related, I've discerned a Postgres > MySQL/MariaDB sentiment on HN
for the last year or so. Is that just my imagination? If not, why is that?
MySQL in my experience is a very powerful RDBMS, maybe I just haven't run up
against its limitations.

~~~
austinhyde
There are probably a couple of reasons, but I'll summarize my experiences with
them.

I haven't dug into MySQL for a few years (last I really worked with it was
5.5), but out of the box, it does a lot of things that are pretty unsafe or
encourage bad practices, such as truncating data when it's longer than the
column size, inserting the zero-value for a NOT NULL column rather than
erroring when a NULL is inserted, confusing timestamp column behavior, no
DDL-level transactionality, etc. Additionally, and this is not necessarily a
bad thing, it has made some odd implementation and feature decisions that can
be (IMHO) counter-intuitive or have a large impact, particularly with regard to
how foreign keys get implemented, but also with things like not having schemas
(database > tables, rather than database > schemas > tables), not having a
boolean type, its datatype specification (int(1) means an integer that
displays only a single digit, rather than denoting storage size), or the fact
that every ALTER TABLE causes a complete on-disk table rewrite.

PostgreSQL, on the other hand, makes every attempt to keep 100% data integrity
at all times, has a lot of killer features (probably the best date/time math
implementation I've ever used, typesafe operators, etc), is generally very
extensible, and most importantly, is extremely predictable. True, it doesn't
have the same scalability features out of the box that MySQL does, but that's
getting better every release, and as mentioned elsewhere, there's plenty of
adequate third-party tooling available (e.g. Slony).

I think MySQL is very much the PHP of the RDBMS world - it does a lot of
silly stupid stuff - mostly for historical reasons - but in the hands of
someone who knows how to use it properly, it can be an extremely useful tool.
Postgres just defaults to being an extremely useful tool out of the box
without needing to know all the gotchas that come with it.

~~~
morgo
> I haven't dug into MySQL for a few years (last I really worked with it was
> 5.5), but out of the box, it does a lot of things that are pretty unsafe or
> encourage bad practices, such as truncating data when it's longer than the
> column size, inserting the zero-value for a NOT NULL column rather than
> erroring when a NULL is inserted, [..]

This is no longer the default in MySQL 5.7, which enables strict SQL mode
(`STRICT_TRANS_TABLES`) out of the box.

------
forgotmysn
[https://news.ycombinator.com/item?id=10894047](https://news.ycombinator.com/item?id=10894047)

~~~
mdk754
Apologies if you're just linking that for reference, but this is the second
article in the series.

------
bra2
Read that as a drunk saying shemales...

~~~
randompost
sober and i read the same

