
No, it's because projects with huge amounts of data were growing beyond the limits of what you can reasonably do on one machine. Machines were less capable then (smaller disks, less memory), and those limits are a lot higher now. If your data is small enough to fit on one (very beefy) machine, it's probably still cheaper to pay for that high-end machine than to distribute the work across a bunch of less capable ones.

There are exceptions - distributing the data can be really helpful if you need to do a lot of bulk I/O (ETL jobs, analytical queries, etc.) - but it comes at the cost of making transactional use cases difficult and expensive. A common pattern is to use a scaled-up OLTP[1] database for user interaction and a scaled-out OLAP[2] database for analytics and ETL jobs; a rough sketch follows the links below.

[1] https://en.wikipedia.org/wiki/Online_transaction_processing

[2] https://en.wikipedia.org/wiki/Online_analytical_processing
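To make the split concrete, here's a minimal Python sketch of that pattern. None of this is from the comment itself: the connection strings, table names, and the warehouse endpoint are all hypothetical, and it assumes the analytics store speaks the Postgres wire protocol (a Redshift/Greenplum-style warehouse, say) so plain psycopg2 works against both sides.

    # Hypothetical sketch of the OLTP/OLAP split described above.
    # Hosts and table names are made up for illustration.
    import psycopg2

    # OLTP side: one scaled-up Postgres handles user-facing transactions.
    oltp = psycopg2.connect("dbname=app host=oltp-primary")
    with oltp, oltp.cursor() as cur:  # connection commits on clean exit
        cur.execute(
            "INSERT INTO orders (user_id, total) VALUES (%s, %s)",
            (42, 19.99),
        )

    # OLAP side: a periodic ETL batch loads new rows into a scaled-out
    # analytics cluster, where the bulk scans and aggregations run.
    olap = psycopg2.connect("dbname=warehouse host=olap-cluster")
    with olap, olap.cursor() as cur:
        cur.execute("INSERT INTO orders_fact SELECT * FROM staging_orders")

The point is the separation: user-facing reads and writes stay on the single beefy OLTP box, while the heavy analytical scans hit the distributed warehouse, so neither workload degrades the other.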
