
I don't know how I feel about this paper: on the one hand, I agree with the sentiment that the relational data model is the natural end state if you keep adding features to a data system (it perfectly captures my sentiment about vector DBs), and that it's silly not to use SQL out of the gate.

On the other hand, the paper is kind of dismissive of engineering nuance and gets some details blatantly wrong.

- MapReduce is alive and well; it just has a different name now (for Googlers, that name is Flume). I'm pretty confident that your cloud bill - whether you use GCP, AWS, or Azure - is powered by a few hundred, if not a few thousand, jobs like this.

- Pretty sure anyone running in production has a hard serving dependency on Redis or Memcache _somewhere_ in their stack, because even if you're not using it directly, I would bet that one of your cloud service providers uses a distributed, shared-nothing KV cache under the hood.

- The vast majority of software is not backed by a truly serializable ACID database implementation.

-- MySQL's default isolation level (REPEATABLE READ) has internal consistency violations[1], and its DDL is non-transactional (quick sketch below the footnote).

-- The classic transaction example of a "bank transfer" is hilariously misrepresentative - ACH is very obviously not implemented using an inter-bank database that supports serializable transactions.

-- A lot of search applications - I would venture to say most - don't need transactional semantics. Do you think Google Search is transactional? Or GitHub code search?

[1]: https://jepsen.io/analyses/mysql-8.0.34
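
On the DDL point, here's a minimal sketch of what bites people in MySQL (the accounts table and column names are purely illustrative): any DDL statement implicitly commits whatever transaction is open, so there's no rolling it back.

    -- Hypothetical MySQL session; table and column names are made up for illustration.
    START TRANSACTION;
    INSERT INTO accounts (id, balance) VALUES (1, 100);
    ALTER TABLE accounts ADD COLUMN note VARCHAR(255);  -- DDL: implicitly commits the INSERT above
    ROLLBACK;  -- no effect - the INSERT is already durable because of the implicit commit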






> The classic transaction example of a "bank transfer" is hilariously misrepresentative - ACH is very obviously not implemented using an inter-bank database that supports serializable transactions.

This is meant more as a pedagogical tool than as a literal representation of how the system works. The intra-bank aspects of ACH absolutely do rely on serializable transactions.
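
For reference, the textbook version in question looks roughly like this - a sketch assuming a single database with an illustrative accounts table, which is exactly the simplification the parent is objecting to (real ACH settlement is batched and spans institutions):

    -- Classic single-database transfer sketch (MySQL-style syntax); names are illustrative.
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;  -- applies to the next transaction
    START TRANSACTION;
    UPDATE accounts SET balance = balance - 100 WHERE id = 1;  -- debit one account
    UPDATE accounts SET balance = balance + 100 WHERE id = 2;  -- credit the other
    COMMIT;  -- either both updates land or neither does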


I would find it hard to argue with CMU papers. They are pretty thorough when it comes to computer science, and the tradition goes way back. If I disagreed with something in this paper, it would be a clue to myself that I need to understand the problem domain better.

I also think that, as a school, the philosophy may be to solve the problem thoroughly without regard to speed, because eventually computers will be faster.

I think the paper is just pointing out that anything great is going to migrate to the relational model or SQL, so if you start there, any missing feature will eventually show up. They also point out how many resources go into deploying immature ideas.



