Hacker News new | past | comments | ask | show | jobs | submit login

"our idea is that developers should know about details such as schemas and indexes (the Parse engineers strongly agreed in hindsight)."

Many engineers I have worked with like to throw around terms like: "CQRS", "Event sourcing", "no schema's", "document-based storage", "denormalize everything" and more. However, when pushed, I often see that they lack a basic understanding of DBMSes, and fill up this gap by basically running away from it. For 95% of the jobs, a simple, non-replicated (but backed-up) DBMS will do just fine. The remaining 3% might benefit from some sharding and the final 2% are for specialized databases, such as column-oriented DBMSes, document-oriented DBMSes or graph-based databases.

At this very moment, I'm working on a classical DBMS, searching through around 200 million tuples within the order of 2 seconds (partially stored columnar using psql array types and GIN indexes). Not for a second did I consider using any kind of specialized database, nor optimized algorithms, except for some application-level caching. Some problems are caused by DBMS to backend and backend-to-frontend network utilization, which is generally not something a specialized DBMS can solve.

You are assuming that 95% of the jobs will remain the same throughout their entire lifetimes. Which is not true and I'm sure many would prefer for a database not to become a liability in case they happen to attract a lot of users, which requires more availability than a classical DBMSes can provide.

The problem is more about constrains required to scale and achieve high-availabily and low-latency, rather than a database choice. But classical DBMSes lack those constrains and you may and probably will end up with a design that just cannot be scaled without removing and redesigning a lot of things that users rely on, and also with engineers that never learned how to do that, so you may end up in an urgent need of a distributed systems specialist too. Scary. You can avoid that by using systems designed for high-availability from the very beginning, like Riak. Same goes for blobs and filesystems, using POSIX API instead of an object storage is going to haunt you later on, because in your designs you may end up relying on things that ultimately cannot scale well and guaranty latency, as gitlab learned the hard way.

I'd say YAGNI applies here since 95% of systems will never need to scale beyond the capabilities of a properly-configured RDBMS. Ensuring consistency and durability with a non-relational, distributed datastore generally requires a lot more detailed understanding of its internals than with a RDBMS, so choosing one of those rather than a RDBMS is going to impose costs that should be deferred until they're absolutely necessary.

Plus it's waaay more likely that bugs in application code will haunt you even more. Especially the lack of things such as foreign keys and other constraints.

Schemas can be straight jackets, but in many situations they're actually closer to life jackets.

Totally agree that schemas are lifejackets. Carfully modelled data can outlive the application it was originally written for.

Poorly structured data, on the other hand, where the schema and consistency is enforced in code is much harder to migrate.

Maybe our problem is that we write systems with a disposable point of view? We don't even think the system will be around long enough to require migration.

I think people underestimate just how far a properly-configured RDMS can scale. A machine with a couple CPUs and a few hundred gigs of memory can do a lot. Couple that with some tastefully-applied caching and query from a read replica when you can tolerate stale data... You can scale pretty far.

And don't forget sharding. If your load is very much user focused.. i.e. user A only cares about the items for user A within the DB... you shard by user_id, and can now host 10 billion users on a farm of RDBMS.

On the other hand, it gets more and more operationally complex as you add shards and failovers and such. If you KNOW you are going to hit that scale (say you are an established company with 100 million customers), it may not be the right choice.

Right. I'll remember that next time I'm an infrastructure lead at a company with a 100 million customers ;)

But thanks for pointing out sharding. It can be powerful strategy.

Once you are at a 100 millions customers company, the urgency becomes to ban all these MySQL and postgre for production systems. Or you need to figure out how to seriously handle a multi master setup in production.

There are the 80% bottom of companies who are just toying around doing small products and MVP, that should stick to what's easy.

There are the 5% at the top who are guaranteed to get 1 million customers no matter what they ship, that should stick to doing things right and sustainable from the start.

Contrary to what the other comment state: sharding and failover are ALWAYS the EASIER choice at scale. The alternative is to be unable to change hardware/systems and to be forced into 15 hours downtime window for every maintenance/upgrade operation, which is a huge liability when you have real users.

The 2 types of companies have opposite priorities.

Use cases will vary, but where I work we've scaled a simple MySQL master/slave setup + memcached for some caching up to around 5 million active users per day. The largest table we regularly touched (no caching) had just over 1 billion rows. Most projects will never reach this scale.

You may be surprised how both extensive and fine grained the constraints you mention can be with current high-end RDBMS. And yes, that includes stuff like high availability and blobs/filesystems. (via naked POSIX API? Why the hell would you do that?) Though - cheap shots be damned - in my experience "high-end RDBMS" don't include (Non-Enterprise) MySQL or - to a certain extent - Postgres.

> Some problems are caused by DBMS to backend and backend-to-frontend network utilization, which is generally not something a specialized DBMS can solve.

I absolutely agree. RDBMSs are the kings of one-size-fits-most. In the NoSQL ecosystem, MongoDB starts growing into this role. Of course, many performance problems are not caused by the backend but rather the network (an average website makes 108 HTTP requests) or the frontend (Angular2 without AOT, I'm looking at you). We took an end-to-end approach to tackle this problem:

1. As objects, files and queries are fetched via a REST API, we can make them cachable at the HTTP level to transparently use browser caches, forward proxies, ISP interception proxies, CDNs and reverse proxies.

2. In order to guarantee cache coherence, we invalidate queries and objects in our CDNs and reverse proxy caches when the change using an Apache Storm-based real-time query matching pipeline.

3. To keep browser up-to-date we ship a Bloom filter to the client indicating any data that has a TTL that is non-expired but was recently written. This happens inside the SDK. Optionally client can subscribe to queries via websockets (as in Firebase or Meteor).

The above scheme is the core of Baqend and quite effective in optimizing end-to-end web performance. Coupled with HTTP/2 this gives us a performance edge of >10x compared to Firebase and others. The central lesson learned for us is that it's not enough to focus on optimizing one thing (e.g. database performance) when the actual user-perceived problems arise at a different level of the stack.

I think the comparison to Firebase is a bit disingenuous. Their core offering is real-time synchronization of large, arbitrary data structures. From perusing Baqend briefly, it seems like your architecture is much more poll-based.

Hi, I'm the real-time guy at Baqend and I'd like to object.

Baqend has streaming queries whose results stay up-to-date over time; purely push-based, no pulling involved. You just have to register your query as streaming query and receive relevant events or the updated result every time anything of interest is happening. In our opinion, this is not only comparable, but goes beyond the synchronization feature of Firebase, because our real-time queries can be pretty complex. A Baqend streaming query can look something like this: SELECT * FROM table WHERE forename LIKE 'Joh%' AND age > 23 ORDER BY surname, job DESC

We are going to publish this feature during the next weeks. But you can already read the API docs here to get a feeling for the expressiveness of the feature: https://www.baqend.com/guide/#streaming-queries

The streaming queries functionality seems to fill this need: https://www.baqend.com/guide/#streaming-queries ?

I work on Firebase now, though not the DB team anymore. I'd love more data on how the DB is being outperformed if you're willing to share. Was it write throughout? Read throughout? Latency? Were you using small or large object sizes? Were indexes and ACLs used in both cases? Are you sure your firebase layout used best practices? Have you run the test lately to verify whether the findings are still true?

Sure, the benchmark and the application under test are public: http://benchmark.baqend.com/

We also do have some benchmarks comparing real-time performance, which are under submission at a scientific conference, so we cannot share them yet.

Unfortunately, testing against Firebase using established benchmarks such as YCSB is not very useful, as it's unclear which limits are enforced on purpose and which limits are inherent to the database design. Would you be open to providing a Firebase setup for heavy load testing w.r.t. to throughput, latency and consistency?

Wow, the presentation here is really polished. Kudos!

For any readers, the test AFAICT is time to last byte for a single reader. It's a bit unfortunate that it seems local caching is affecting the bagend result by the time you switch the comparison to Firebase (though in full disclosure, Firebase's local cache isn't available on JS so that's a bit on us too).

I could give a few perf tips on the data layout for Firebase but it isn't clear whether the toy app has intentional indirection to model something more real.

I wanna say he's actually talking about you when he talks mentions developers who over complicate tjings

> At this very moment, I'm working on a classical DBMS, searching through around 200 million tuples within the order of 2 seconds (partially stored columnar using psql array types and GIN indexes).

hi, this comment was timely. I have been asking myself what to use to persist a 10 million node graph. There has been a lot of marketing around using Hbase (obscene to setup and maintain) and Cassandra (much better.. but paywalled).

My main db is postgresql and we use jsonb quite extensively. The problem is that there is a lot of statements that "a single huge jsonb table in postgresql will never scale. you will have made a horrible mistake".

Seems you have ample experience on this front - whats your take?

Depends heavily on your data access patterns. If you need eg recursive queries or pattern matching, you might be better served by a graph database.

hi sandGorgon. I agree with mhluongo on this. If you want to do things like transitive closure or connected components, you fall in the 2% section above. However, if you want to encode graphs for message systems, a simple table with parent-id might do just fine. Wrt jsonb: if you only do key->jsonb tables, you might want to consider another kind of system. Nevertheless, we've got millions of trajectories, connected to millions of annotations (links to trajectories with semi-structured jsonb), and combined with some GIN indexes, it works pretty well.

It seems like you're suggesting that "classical" DBMS are inherently relational. Pardon me if I've misread you but, respectfully, as awesome as relational databases are, you haven't offered any reason why they should be your default versus some kind of NoSQL store. Insofar as your concern is simplicity, which you say you prefer (understandably!), it's actually just as likely the simplest option will be non-relational. There's also nothing inherently specialized about a non-relational store; in particular, hierarchical and key-value stores have at least as much maturity in production as relational ones (cf. IMS and dbm, respectively).

(Sorry for the pedantry, but I both work on a cloud document-oriented database and recently watched a bunch of collegiate developers struggle to set up MySQL when their queries could easily have been handled by a simpler system.)

The following reasons come to mind wrt RDBMS:

- When properly normalized, future modifications to the relational model will often be much simpler. Transforming hierarchical data stores or document-based ones often involves changing much of the code which calls the database

- Query optimizers are often performing work which is difficult to implement yourself.

- When properly called, query composition allows more computation to be done close to the data.

- Knowledge of RDBMs is widespread allowing a low barrier to entry (is also a huge disadvantage, since it brings inexperienced people to use a sophisticated tool, leading to overgeneralized bad press)

- Index management in relational model is often simpler and logical

- Many RDBMs have a proven track-record of stability, code quality and feature completeness. Take a look at PostgreSQL, for example: GIN / GIST / PostGIS / date-time / windowing functions / streaming results. Also, the extensibility of postgres is astonishing, the community is warm and welcoming to beginners.

Obviously, I could make a similar list of disadvantages of RDBMs. It's just my experience that the advantages mostly outweigh the disadvantages.

I'd argue that the power of a classical relational database makes it easier to handle for someone without in-depth knowledge of databases than many NoSQL databases.

Of course, if all you need is to look up values by keys, a key-value store is a much simpler solution. But the moment you need more complicated or just different kinds of queries than you expected, NoSQL can be rather difficult. You can get a lot of complicated queries done with mediocre SQL knowledge and a general-purpose RDBMS, but trying to use a more specialized database outside the intended use can be pretty painful.

I don't have all that much experience with NoSQL databases, but my impression was always that it is more important there to understand the limitations of the database than with something like Postgres that is more forgiving. They might be simpler to get started with, but much harder to use correctly.

The reason most of the time RDBMS are a fit most of the time is because majority of data is inherently relational. It seems that the problem for people with using relational databases is that there's a learning curve to use it. This makes NoSQL much more appealing at first, but once you start using document store and unless your data is trivial (everything can be referenced with a single key) you will realize how hard it is. Sure, NoSQL is easy because you don't have schemas, relations, transactions etc, but because it doesn't have all of that now you have to implement this in your application which is not an easy task.

Another thing (which often overlooked) is that because relational database typically matches data model better your data will take less of space than when stored in NoSQL. With Postgres there's also the extensibility.

I learned this myself, we had a database that stored mapping of IP to a zip code and latitude/longitude to a zip code. The database was on MongoDB, but it wasn't too big (~30GB) the thing is that to maintain a performance all of the data needed to fit in RAM.

I was a bit curious and decided to see how Postgres would deal with this. So I loaded ip4r extension (to store IP ranges) and PostGIS to enable lookups on latitude/longitude. All the data fit in ~600MB, and with proper indexes postgres returned all queries instantly (sub milisecond).

The Mongo was running on 3 c3.4xlarge instances, but postgres would be fine on smallest available one and would still be able to cache whole database in RAM.

IMO, because you can use a RDB as a K/V store, but you can't use a K/V store as a RDB. And I have yet to personally work at a scale where a sharded RDB was incapable of handling the load.

The big relational DBs scale well (near-terabyte with thousands of queries per second) with little maintenance. They scale ridiculously well (terabyte+ with a million+ queries per second) with more maintenance. Most loads simply don't require the scale at which RDBs start to actually falter at.

> At this very moment, I'm working on a classical DBMS, searching through around 200 million tuples within the order of 2 seconds

Just curious, does this kind of performance require dedicated hardware or is it possible on platforms like AWS?

This is on a AWS RDS instance, everything is in (DBMS) cache, db.m4.large.

You should revisit your schema. That sounds much too slow.

    q)a:(200000000?200000000),-42 /pick some random numbers
    q)\t a?last a
303msec for a linear scan of 200m ints on an i5 laptop means you should be able to do better than 2sec for a "smart" query.

Well, the schema is a bit more complicated than that. I'd say the 100M / s query rate is sufficient. Here we search for trajectories (road segments + time) going through certain regions within a certain repeating time-interval (say, 2 AM on a Sunday), in a 3 month time interval.

It seems slow to me too...

We have tables with > billion records In RDS / SQLServer and return results in <1 second...

Although it sounds like the OP might have some tricky queries, just want to echo that relational databases can be (very) fast if you know how to index them: one of our RDS MySQL tables has 3+ billion rows and queries on it average ~50 milliseconds.

And that's for a sequential scan over all of those tuples?

That'd be very impressive. Does SQLServer support parallel execution? Or how come it be so fast?

> Does SQLServer support parallel execution?

For a long time as far as I know?

Cool. I'm not familiar with it, that's why I ask :).

Anyway, is the performance you mentioned for a scan over the entire table? E.g. to do an aggregate?

Honestly don't remember, I'm working on new stuff in PostgreSQL now.

I know a count of the whole table takes about a minute tho. But the filter > group by was ~1s as the grouping was done on about < 50 rows of the filtered result.

Without knowing the original query above was using it's speculation on why it was slow and mine was fast. My argument is just that SQL Server isn't magically slow. MySQL/PostgreSQL/SQL Server are super fast. And if you don't massage the database it can be super slow too.


> I know a count of the whole table takes about a minute tho. But the filter > group by was ~1s as the grouping was done on about < 50 rows of the filtered result.

Yeah, that makes sense.

I'm running some queries in production that take a 150 million row table, filter it down to ~100-300k rows which are then aggregated/grouped. This usually takes < 1s. However, if I'd try to do a count(*) on the table, that'd be around a minute as well.

> Without knowing the original query above was using it's speculation on why it was slow and mine was fast. My argument is just that SQL Server isn't magically slow. MySQL/PostgreSQL/SQL Server are super fast. And if you don't massage the database it can be super slow too.

Yeah. The query / query plan will be needed to go more in-depth on these kind of discussions. The amount of disk vs memory hits during execution obviously as well.

Anyway, I just wanted to understand if SQL Server was e.g. an order of magnitude faster than Postgres when scanning a large number of tuples. But I guess the answer to that is: probably not.

PostgreSQL has a BRIN index's as of 9.6 which may make some forms of aggregation faster than SQL Server I believe. Would need to do real-world tests to verify that tho.

Just to clarify, the whole db fits into 8gb of ram you're saying?

Why non-replicated? Do you mean non-sharded?

I actually meant clustered. Replication is of course a good idea, but distributed query processing, esp. in combination with transaction processing is often too complicated to get the job done.

I agree, but how do you argue for that?

I work with a customer where every application has to have a dedicated environment, running on VMs in a seperate VPC on a private cloud. The default DB is a galera cluster with 3 nodes, even for the most simple application.

The argument for it is that cloud VMs are expected to fail, so everything has to be setup to tolerate instance failure without downtime.

All I have to counter it is the supposedly higher risk of data loss in a complicated split brain, the unspecific "more complicated" etc. While the risk of instance failure is very real, that feels a bit theoretical and vague.

> how do you argue for that

0 downtime and 0 data loss is impossible, both theoretically and practically. Even a Galara cluster is going to have downtime if a node fails (timeouts expiring, etc.).

Really, it comes down to what trade-offs is the company willing to accept: the potential for stale reads for high availability, slow/aborted writes for high consistency, downtime for reduced complexity, etc.

A few notes about Galara: its performance going to become severely degraded if one of the nodes decides it needs to be rebuilt - both that node and the donor node will effectively be out of the loop, with the network link between them saturated for the duration. That degraded state doesn't require the node to go down, it can happen spontaneously.

Galara also imposes limits on write frequency and location - if you're doing many writes, you don't want to split those writes between nodes, since every write has to look at every other pending transaction on every node before it can occur.

An automated RDB master failover can occur in under 30 seconds, easily. You can also typically run master-master without a risk of split brain by specifying even and odd primary keys.

> cloud VMs are expected to fail

Yup, but that timeframe is usually on the order of months or years. And if you use externally backed disk stores (at the cost of read/write speed) you can be back up in a matter of minutes with a single master. Even a full restore from backups can typically be done in 10-15 minutes; constrained mostly by your network speed.

Close enough.

"searching through around 200 million tuples within the order of 2 seconds (partially stored columnar using psql array types and GIN indexes). Not for a second did I consider using any kind of specialized database, nor optimized algorithms, except for some application-level caching. "


That's cool and all, but for some real-time applications, retrieving in the order of seconds is unacceptable.

DBMS has their earned place in the backend stack but I found that statement of covering 98% of the cases a bit of a stretch.

Of course, this is an extreme example. Most of our front-end is able to serve quite complicated data within 500 milliseconds. Some queries are over a longer time-span (3 months, 0.7 million truck-movement-legs, +/- 200M links), so a 2 second wait-time (including a spinner) is sufficient.

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact