Hacker News

Well here's the thing: your typical web-oriented developer doesn't like SQL much. Probably because thinking in sets is very different from OO or whatever they're into. Most of them try to hide from it behind Rails/ActiveRecord or Hibernate or whatever. MySQL has a very basic, crude SQL and so doesn't intimidate them. Whereas to use PostgreSQL to its fullest requires really getting into set theory and relational thinking.

Plus if you look under the covers of MySQL each "table" is just a file on the disk. Unsophisticated database users feel comfortable with that too. "Sharding", another thing they like, is another attempt to evade using advanced features (not that partitioning is actually "advanced" these days, nor is a semi-decent query optimizer) that other databases take for granted.

Unsophisticated database user here.

Not only was I totally clueless that MySQL's innards are laid out on a one-file-per-table basis (I'm an unsophisticated database user precisely because I don't want to know how my database works on the inside), but I'm also just sophisticated enough to know that any attempt to exploit the knowledge that the users table corresponds to a single file will result in my dog being assassinated by data corruption SQL ninjas.

How is sharding an attempt to evade using advanced features?

Sharding is purely a method to work around a lack of scalability features in the database itself.
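To make that concrete, here is a minimal sketch of what application-level sharding tends to look like. The shard names and both helper functions are hypothetical, made up for illustration and not taken from any particular product. The point is that the application ends up doing routing and query planning that a database would otherwise do.

```python
import hashlib

# Hypothetical shard names; a real deployment would map these to connections.
SHARDS = ["users_db_0", "users_db_1", "users_db_2", "users_db_3"]


def shard_for(user_id: int) -> str:
    """Pick a shard by hashing the key, so routing is stable across processes."""
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(SHARDS)
    return SHARDS[index]


def group_by_shard(user_ids):
    """Split a multi-key lookup into per-shard batches.

    The application has to issue one query per shard and merge the
    results itself, since no single database sees all the rows.
    """
    plan = {}
    for uid in user_ids:
        plan.setdefault(shard_for(uid), []).append(uid)
    return plan
```

Note that with simple modulo routing like this, changing the number of shards remaps most keys, which is one more problem the application then has to solve.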

If you want to scale horizontally, how do you do it without sharding?

There is a limit to vertical scaling and it becomes more and more expensive.

You don't need to sacrifice single-image to scale horizontally, and you haven't for years. Sharding was invented by IBM in the 80s and all the major vendors had abandoned it by the 90s.

Database newbie here. I always thought sharding was the only thing to do once your DB starts choking on the volume; I did not know there was any other way to go about it. Care to share any information on how you would scale a DB (PostgreSQL, maybe)? Any Google keywords would be welcome as well.

So the MySQL camp would have you believe. But a) a real database running a balanced workload scales far further on the same kit than MySQL anyway, and b) beyond that you go with active/active clustering (e.g. Oracle RAC). I personally work on a 30 TB database like this and I've seen people take it to >100 TB.

A problem with shared-everything clusters like RAC is that they tend to scale poorly under write-heavy loads. Other problems are the significant complexity, low predictability (especially in terms of latency) and, of course, the Oracle tax.

It boils down to the old question of the right tool for the job, and a RAC cluster is not the right tool for most webapp scenarios.
