
The AWS and MongoDB Infrastructure of Parse - DivineTraube
https://medium.com/baqend-blog/parse-is-gone-a-few-secrets-about-their-infrastructure-91b3ab2fcf71#.ctvj6t498
======
edejong
"our idea is that developers should know about details such as schemas and
indexes (the Parse engineers strongly agreed in hindsight)."

Many engineers I have worked with like to throw around terms like "CQRS",
"event sourcing", "no schemas", "document-based storage", "denormalize
everything" and more. However, when pushed, I often find that they lack a
basic understanding of DBMSes and paper over that gap by essentially running
away from it. For 95% of jobs, a simple, non-replicated (but backed-up) DBMS
will do just fine. The remaining 3% might benefit from some sharding and the
final 2% are for specialized databases, such as column-oriented,
document-oriented or graph-based DBMSes.

At this very moment, I'm working on a classical DBMS, searching through
around 200 million tuples in roughly 2 seconds (partially stored columnar
using psql array types and GIN indexes). Not for a second did I consider
using any kind of specialized database or optimized algorithms, except for
some application-level caching. Some problems are caused by DBMS-to-backend
and backend-to-frontend network utilization, which is generally not something
a specialized DBMS can solve.

~~~
zzzcpan
You are assuming that 95% of the jobs will remain the same throughout their
entire lifetimes. That is not true, and I'm sure many would prefer that a
database not become a liability in case they happen to attract a lot of
users, which requires more availability than a classical DBMS can provide.

The problem is more about the constraints required to scale and achieve high
availability and low latency than about the database choice itself. Classical
DBMSes lack those constraints, so you may (and probably will) end up with a
design that just cannot be scaled without removing and redesigning a lot of
things that users rely on, and with engineers who never learned how to do
that, so you may also end up in urgent need of a distributed systems
specialist. Scary. You can avoid that by using systems designed for high
availability from the very beginning, like Riak. The same goes for blobs and
filesystems: using a POSIX API instead of object storage is going to haunt
you later on, because your designs may end up relying on things that
ultimately cannot scale well or guarantee latency, as GitLab learned the hard
way.

~~~
dhd415
I'd say YAGNI applies here, since 95% of systems will never need to scale
beyond the capabilities of a properly configured RDBMS. Ensuring consistency
and durability with a non-relational, distributed datastore generally
requires a much more detailed understanding of its internals than an RDBMS
does, so choosing one rather than an RDBMS imposes costs that should be
deferred until they're absolutely necessary.

~~~
oblio
Plus, it's waaay more likely that bugs in application code will haunt you,
especially given the lack of things such as foreign keys and other
constraints.

Schemas can be straitjackets, but in many situations they're actually closer
to life jackets.
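
As a tiny illustration of that point (hypothetical tables, node-postgres
assumed), a foreign key turns a silent application bug into an immediate
error:

    import { Client } from 'pg';

    const client = new Client({ connectionString: process.env.DATABASE_URL });

    async function demo() {
      await client.connect();
      await client.query(`
        CREATE TABLE users  (id bigint PRIMARY KEY);
        CREATE TABLE orders (
          id      bigserial PRIMARY KEY,
          user_id bigint NOT NULL REFERENCES users(id)
        )`);
      // With the constraint, this fails loudly with a foreign-key violation
      // instead of silently creating an orphaned row -- exactly the class of
      // bug a schemaless store would let through.
      await client.query('INSERT INTO orders (user_id) VALUES (42)');
    }

    demo().catch(err => console.error(err.message)).finally(() => client.end());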

~~~
gerbilly
Totally agree that schemas are life jackets. Carefully modelled data can
outlive the application it was originally written for.

Poorly structured data, on the other hand, where the schema and consistency
are enforced in code, is much harder to migrate.

Maybe our problem is that we write systems with a disposable point of view? We
don't even think the system will be around long enough to require migration.

------
schmichael
As an Urban Airship employee from 2010-2013, this is eerily familiar stuff. I
feel like we should all have a big "we tried and failed to use MongoDB as a
distributed OLTP database" support group.

One point of disagreement:

> However, they had an unspoken value system of not trusting their users to
> deal with complex database and architectural problems.

At least for enterprise customers (who actually pay the bills) this is the
correct value system. It's not that enterprise developers are bad. It's that
enterprise development moves incredibly slowly, so the more freedom, options
and implementation details you give them to work with, the longer it will
take them to be successful.

What you as a startup engineer may feel is a powerful and expressive API just
added 6-12 months to the development cycle of every enterprise customer...
and thus many will give up on the integration.

------
tbrock
Parse's MongoDB deployment was definitely not the largest by a long shot. Not
even top 10.

They had the largest number of databases on a single replica set for sure, but
largest in terms of cluster size or data size? No way.

~~~
DivineTraube
Do you have a source for that? It is what the Parse engineers told me, but
that kind of information is notoriously hard to corroborate. Whenever I asked
MongoDB engineers this question, they evaded giving concrete numbers for
deployment sizes.

~~~
tbrock
I can assure you that you are incorrect. I am the primary source and very
aware of Parse's scale.

I worked on building a feature that allowed Parse to list that many databases
(the response did not fit in the 16 MB document size limit, which required
multiple cursors, i.e. command cursors).

I am a collaborator on the parse-server project, love the Parse team, respect
what they built, and have no incentive to deceive you.

I worked at MongoDB for 4 years in various roles (consulting, engineering,
etc.) and saw all sorts of deployments of various shapes and sizes.

~~~
DivineTraube
I do believe you, but can you share some more hard facts?

I've changed the passage to "one of the greatest". As stated in the
disclaimer above the article, I'm mainly presenting assertions by Parse
engineers, who were very sure they were the largest MongoDB user in terms of
cluster size.

------
bogomipz
I would like to hear more about the introduction of LSM-based RocksDB storage
into the Parse architecture. Specifically, how long was it before they felt
MMAP was no longer meeting their needs?

I was also curious about this: >"Frequent (daily) master reelections on AWS
EC2. Rollback files were discarded and led to data loss."

Was the velocity of rollbacks so high that they weren't able to process them?

I thought Parse was a great company with a great product that served an
important market. I'm still not sure why FB acquired them only to retire it.

~~~
mkania
We had issues with the MMAPv1 storage engine from the beginning, mostly due
to its lack of document-level locking and its storage bloat.

One of the biggest benefits of being acquired by Facebook was working with
the RocksDB team to come up with MongoDB + RocksDB, but this was also
happening around the time when Parse was starting to wind down.

Being the only company running this new storage engine freaked us the fuck
out. So we tried to blog and give talks as much as possible to get other folks
interested and willing to test it out.

Parse now running MongoDB on RocksDB:
http://blog.parse.com/announcements/mongodb-rocksdb-parse/

Strata: Open Source Library for Efficient MongoDB Backup:
http://blog.parse.com/learn/engineering/strata-open-source-library-for-efficient-mongodb-backups/

MongoDB + RocksDB: Writing so Fast it Makes Your Head Spin:
http://blog.parse.com/learn/engineering/mongodb-rocksdb-writing-so-fast-it-makes-your-head-spin/

MongoDB + RocksDB: Benchmark Setup & Compression:
http://blog.parse.com/learn/engineering/mongodb-rocksdb-benchmark-setup-compression/

~~~
bogomipz
I thought you might say that. If you have an app that does a lot of deletes,
that bloat becomes noticeable quickly. I think they might actually have
document-level locking now, finally. Rocks is interesting in that it deals
with the write amplification problem, and it's tunable. TokuMX was another
interesting storage engine and had good compression; I would be curious
whether you ever evaluated it.

Are you with Baqend now? Is this your medium post?

~~~
mkania
Before even looking into RocksDB we tried TokuMX and even got pretty well
acquainted with their dev team. We ran into the same issues when testing
WiredTiger. Neither could handle millions of Mongo collections. Since we were
involved in the RocksDB storage engine from the beginning, we made sure the
implementation could handle that many collections.

This isn't my post and I don't work at Baqend, but I will say that the author
comes off as a presumptuous asshat.

~~~
bogomipz
Yeah this reads like this person at Baqend is going over "lessons learned"
while at Parse. That's why I thought it might be someone from Parse. It seems
like bad form to write a blog post on what other people should have done
differently and use that to publicize your own company. Parse was a success.

~~~
DivineTraube
> So here are some facts and trivia that are not so well-known or published
> that I collected by talking to Parse engineers that now work at Facebook. As
> I am unsure about whether they were allowed to share this information, I
> will not mention them by name.

It's both lessons learned by Parse engineers and us, so I think the intended
ambiguity is okay.

~~~
bogomipz
>"It's both lessons learned by Parse engineers and us"

All of those bullet points are from Parse's history. Where, then, is Baqend's
contribution to the "lessons learned" in this blog post?

------
DivineTraube
Author of the post here. If you have additional key insights about the Parse
infrastructure, please post them here and I will directly add them to the
article.

~~~
koolba
What was the throughput (per server) after the Go rewrite? I'd imagine an
async rewrite would be a lot more efficient than the 15-30 req/s that the
Rails version was getting.

How often (if ever) did you experience data loss due to the MongoDB write
concern = 1?
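
For context on the write-concern question, a sketch of the trade-off using
the current MongoDB Node driver (illustrative names; Parse's stack was Ruby
and Go, so this is only an analogy):

    import { MongoClient } from 'mongodb';

    const client = new MongoClient('mongodb://localhost:27017');
    const objects = client.db('app').collection('objects');

    async function writeConcernDemo() {
      // w:1 acknowledges once the primary alone has the write. If that
      // primary is rolled back after a re-election, acknowledged writes can
      // be lost -- the failure mode behind the question above.
      await objects.insertOne({ hello: 'world' }, { writeConcern: { w: 1 } });

      // w:'majority' waits for replication to a majority of the replica set
      // and survives failover, at the cost of extra latency per write.
      await objects.insertOne(
        { hello: 'world' },
        { writeConcern: { w: 'majority' } },
      );
    }

    writeConcernDemo().catch(console.error).finally(() => client.close());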

~~~
inlined
I was the lead for Parse Push. I reverse-engineered Resque in Go so we could
cleanly switch any async stage to Go. I added a few features, like multiple
workers per Redis connection and soft acks that let the framework pull the
next job while still keeping the server alive during a soft shutdown until
the job completed.
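
For readers unfamiliar with Resque, a rough sketch of the wire format any
port has to speak (queue name and job handling here are illustrative, not
Parse's code): each queue is a Redis list of JSON payloads that workers poll.

    import Redis from 'ioredis';

    const redis = new Redis();

    // Resque stores jobs as JSON blobs in one Redis list per queue, under
    // keys of the form "resque:queue:<name>".
    async function pollOnce(queue: string): Promise<void> {
      const raw = await redis.lpop(`resque:queue:${queue}`);
      if (!raw) return; // empty poll -- the per-worker cost that adds up
      const job = JSON.parse(raw) as { class: string; args: unknown[] };
      console.log(`would run ${job.class} with`, job.args);
    }

    // A naive polling loop; many workers each running a loop like this is
    // what drives the Redis CPU load mentioned below.
    setInterval(() => pollOnce('push').catch(console.error), 1000);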

After moving the APNs (v2) server to Go I was able to map the wonky
networking to Resque much better. My show & tell that week was literally
"this MacBook is now running higher throughput than our cluster of >100 push
servers". For APNs, the rewrite was a boon of over 500x per server.

As a great fringe benefit, fewer Resque servers meant less Redis polling. CPU
dropped from 97% to <10% IIRC, which saved us the next Black Friday, since
Redis falling over from push load had often caused problems.

I don't recall how often data loss was a problem. One thing the article
didn't go into much detail on was when certain ops tricks happened. For
example, I recall that reading from secondaries was reserved for larger push
customers only. These queries could be wildly inefficient, have results in
the tens of millions, and, due to Mongo bugs, would lock up an entire
database if they had a geo component in the query. In later versions of
Mongo, where the bugs were fixed, and after another generation of the
auto-indexer, we were able to send all traffic back to primaries.

~~~
inlined
Oh, and our ResqueGo outperformed Ruby so badly that we couldn't do a
reasonable X% server transition from Ruby to Go by rebalancing the number of
workers. It had to be done by making a new Go queue and doing a % rollout of
which queue work was sent to. We learned that the hard way once (though
luckily there was no regression IIRC).

------
chatmasta
I'm surprised vendor lock-in was not mentioned as a problem. I suppose it's a
fundamental problem with offering "BaaS", and since the author is pushing his
own BaaS offering, any critique of vendor lock-in may be unlikely.

I first used Parse for a mobile app that grew to 600k users. I was totally
against adopting Parse, for a number of reasons including the unpredictable
billing, failing background jobs, and the need for retry logic in every
single write query. But my biggest issue was with vendor lock-in. When your
app succeeds, and the backend is dependent on Parse being in business, Parse
becomes a liability. Eventually you will need to take on a project to
"migrate off Parse." And you know what? That sucks for Parse, because they
were counting on power users to pay the big bills. But in reality, once a
customer became a "power user," they started the process of migrating off
Parse.

When Parse shut down, I initially felt vindicated - ha! Vendor lock-in. But
they handled the shutdown and transition to open source _extremely_ well. As
a result, parse-server is now an open-source, self-hosted product with all
the benefits of Parse.com but without the vendor lock-in!

I've been using parse-server exclusively for new projects. I'm very happy
with it and it is always improving (though backwards compatibility has been
hit or miss... but that's the price of rapid releases). It's very easy to set
up, and does a lot of CRUD for you (user registration, email confirmation,
login, forgot password, sessions). You can call the API from an open source
SDK in almost any language on any platform (note: I'm the maintainer of
ParsePy). Also, because you have direct access to Mongo, you can optimize
query performance outside of Parse. For example, you can create unique
indexes, where with Parse you had to query for duplicate objects at the
application level. There's even initial support for Postgres instead of
Mongo. Also, the dashboard is nice for non-technical people on your team to
understand your data structures and ask more informed questions.
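
As one example of that direct-to-Mongo tuning, a sketch with the Node driver
(collection and field names are assumptions):

    import { MongoClient } from 'mongodb';

    const client = new MongoClient(
      process.env.MONGO_URL ?? 'mongodb://localhost:27017');

    async function ensureUniqueUsernames() {
      const users = client.db('app').collection('_User');
      // Concurrent duplicate inserts now fail at the database (error E11000)
      // instead of racing a query-then-insert duplicate check in app code.
      await users.createIndex({ username: 1 }, { unique: true });
    }

    ensureUniqueUsernames().catch(console.error).finally(() => client.close());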

I'm not sure I would ever use another BaaS. It just seems like such a dumb
decision to offload the entire backend of your app to a proprietary service.
If the service were totally open source from the beginning, with a
self-hosted option, then I would consider it. At least that eliminates the
threat of vendor lock-in. I get the feeling that the excellent handling of
the Parse.com shutdown was an exception to the rule. I don't want to take
unnecessary risks with the most important code of a project.

~~~
erikwitt
I agree, the Parse shutdown was organized extremely well. The open source
parse server, one year of migration time and a ton of new vendors that now
offer to host your Parse app all made it much easier to handle the shutdown.
It's also great to see the community still working on the open source server.

That said, there are a lot of upsides to having a company work full-time on
your proprietary cloud solution and ensure its quality and availability. If
an open source project dies or becomes poorly maintained, you are in trouble
too. Your team might not have the capacity to maintain such a complex project
on top of their actual tasks.

Also, open sourcing your platform is a big risk for a company. Take RethinkDB
for example: great database, outstanding team, but without a working business
model, and most recently without a team working full time, it is doomed to
die eventually.

Nevertheless, we try to make migrating from and to Baqend as smooth as
possible. You can import and export all your data and schemas, and your
custom business logic is written in Node.js and can be executed anywhere. You
can also download a community server edition (single-server setup) to host it
yourself.

Still, a lot of users require proprietary solutions and the maintenance and
support that come with them. And often they have good reasons, from requiring
a maintenance-free platform to warranties or licensing issues. After all, a
lot of people are happy to lock into AWS even though solutions based on
OpenStack, Eucalyptus etc. are available.

------
CurlyBraces
"applications were often inconsistent" \- I've heared this about parse before.
Always thought this is due to using MongoDB. If you use the same database how
can you enforce more consistency?

~~~
erikwitt
Although MongoDB has its limits regarding consistency, there are things that
we do differently from Parse to ensure consistency:

- The first thing is that we do not read from slaves. Replicas are only used
for fault tolerance, as is the default in MongoDB. This means you always get
the newest object version from the server.

- Our default update operation compares object versions and rejects writes if
the object was updated concurrently. This ensures consistency for
single-object read-modify-write use cases. There is also an operation called
"optimisticSave" that retries your update until no concurrent modification
gets in the way. This approach is called optimistic concurrency control (a
sketch of the version-check pattern appears at the end of this comment). With
forced updates, however, you can override whatever version is in the
database; in this case, the last writer wins.

- We also expose MongoDB's partial update operators to our clients
(https://docs.mongodb.com/manual/reference/operator/update/). With these, one
can increase counters, push items into arrays, add elements to sets and let
MongoDB handle concurrent updates. With these operations, we do not have to
rely on optimistic retries.

- The last and most powerful tool we are currently working on is a mechanism
for full ACID transactions on top of MongoDB. I've been working on this at
Baqend for the last two years and also wrote my master's thesis on it. It
works roughly like this:

    1. The client starts the transaction, reads objects from the server (or
    even from the cache, using our Bloom filter strategy) and buffers all
    writes locally.

    2. On transaction commit, all read versions and updated objects are sent
    to the server to be validated.

    3. The server validates the transaction and ensures isolation using
    optimistic concurrency control. In essence, if there were concurrent
    updates, the transaction is aborted.

    4. Once the transaction is successfully validated, the updates are
    persisted in MongoDB.

There is a lot more in the details to ensure isolation and recovery as well
as scalability, and to make it all work with our caching infrastructure. The
implementation is currently in our testing stage. If you are interested in
the technical details, this is my master's thesis:
https://vsis-www.informatik.uni-hamburg.de/getDoc.php/thesis/868/Thesis_Erik_Witt.pdf
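
A minimal sketch of the single-object version check described above, using
the MongoDB Node driver (field names are illustrative, not Baqend's actual
implementation):

    import { Collection } from 'mongodb';

    // Each object carries a version counter; an update only matches if the
    // version is unchanged, otherwise we re-read and retry -- the
    // "optimisticSave" pattern.
    async function optimisticSave(
      coll: Collection<any>,
      id: any,
      modify: (doc: any) => any,
    ): Promise<void> {
      for (;;) {
        const doc = await coll.findOne({ _id: id });
        if (!doc) throw new Error('object not found');
        const updated = modify({ ...doc });
        const res = await coll.replaceOne(
          { _id: id, version: doc.version },      // rejects concurrent writers
          { ...updated, version: doc.version + 1 },
        );
        if (res.modifiedCount === 1) return;      // no concurrent update: done
        // Lost the race: loop, re-read the new version, and try again.
      }
    }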

------
esseti
It would be interesting to understand more about "Static pricing model
measured in guaranteed requests per second did not work well." What happened,
and what would have been a better solution in retrospect?

~~~
DivineTraube
The core problem was that the sustained number of requests did not really
cause any bottlenecks. The actual problems were:

- On the Parse side: expensive database queries on the shared cluster that
could not be optimized, since developers had no control over indexing and
sharding.

- On the customer side: any load peak (e.g. a front-page story on Hacker
News) caused the Parse API to run into rate limits and drop your traffic.

~~~
chrischen
I feel like indexing could be done dynamically, based on performance analysis
and query profiling. That way you can also do it without really understanding
the application logic at all.

~~~
erikwitt
That's actually exactly what Parse did. They used a slow query log to
automatically create up to 5 indexes per collection. Unfortunately, this did
not work that well, especially for larger apps.

I guess 5 indexes might be a little low for some apps. On the other hand, too
many or too large indexes can become a bottleneck too. In essence, you want
to be quite careful when choosing indexes for large applications.

Also, some queries tend to get complicated, and choosing the best indexes to
speed up those queries can be extremely difficult, especially if you want
your algorithms to choose them automatically.
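
A hedged sketch of such a slow-query-log loop with the MongoDB Node driver
(the threshold, field extraction and 5-index cap are illustrative; Parse's
real auto-indexer was far more sophisticated):

    import { MongoClient } from 'mongodb';

    const client = new MongoClient('mongodb://localhost:27017');
    const db = client.db('app');

    async function suggestIndexes() {
      // Profiling level 1 logs only operations slower than `slowms`
      // into the capped system.profile collection.
      await db.command({ profile: 1, slowms: 100 });

      const slow = await db.collection('system.profile')
        .find({ op: 'query', millis: { $gt: 100 } })
        .limit(100)
        .toArray();

      for (const entry of slow) {
        // Naively propose an index on every field seen in a slow filter,
        // capped at 5 candidates, echoing the limit described above.
        const filter = (entry as any).command?.filter ?? {};
        for (const field of Object.keys(filter).slice(0, 5)) {
          console.log(`candidate index for ${entry.ns}: { ${field}: 1 }`);
        }
      }
    }

    suggestIndexes().catch(console.error).finally(() => client.close());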

~~~
simscitizen
We created more than 5 indices per collection if necessary. But
fundamentally, some queries can't be indexed, and if you allow your customers
to make unindexable queries, they'll run them. Think of queries with an
inequality as the primary predicate, or queries where an index can only
satisfy one of the constraints, like SELECT * FROM Foo WHERE x > ? ORDER BY y
DESC LIMIT 100, etc.
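
For readers following along, the same query shape in MongoDB terms
(collection name taken from the SQL above): a single index on { x: 1 } or
{ y: -1 } can serve the filter or the sort, but not both at once.

    import { Db } from 'mongodb';

    // A range predicate on one field combined with a sort on another: Mongo
    // must either fetch the matching docs and sort them in memory, or walk
    // the sort index and filter -- both degenerate as the collection grows.
    async function unindexableShape(db: Db) {
      return db.collection('Foo')
        .find({ x: { $gt: 0 } })  // inequality on x
        .sort({ y: -1 })          // sort on a different field y
        .limit(100)
        .toArray();
    }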

~~~
erikwitt
That is absolutely right. You can easily write queries that can never be
executed efficiently, even with great indexing. Especially in MongoDB, if you
think about what people can do with the $where operator.

What, in retrospect, would be your preferred approach to preventing users
from executing inefficient queries?

We are currently investigating whether deep reinforcement learning is a good
approach for detecting slow queries and making them more efficient by trying
different combinations of indices.

~~~
inlined
FWIW, the Google model is to just cut unindexable queries from the feature
set. You can only have one sort or range field in your query in Datastore,
IIRC.

~~~
DivineTraube
The Google Datastore is built on Megastore. Megastore's data model is based
on entity groups, which represent fine-grained, application-defined
partitions (e.g. a user's message inbox). Transactions are supported per
co-located entity group, each of which is mapped to a single row in Bigtable,
which offers row-level atomicity. Transactions spanning multiple entity
groups are discouraged, as they require expensive two-phase commits.
Megastore uses synchronous wide-area replication. The replication protocol is
based on Paxos consensus over positions in a shared write-ahead log.

The reason the Datastore allows only very limited queries is that it seeks to
target each query to an entity group in order to be efficient. Queries using
the entity group are fast, auto-indexed and consistent. Global indexes, on
the other hand, are explicitly defined and only eventually consistent
(similar to DynamoDB). Any query on unindexed properties simply returns empty
results, and each query can have only one inequality condition [1].

[1]
https://cloud.google.com/datastore/docs/concepts/queries#inequality_filters_are_limited_to_at_most_one_property
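
A small sketch of that restriction with the Node Datastore client (kind and
property names are made up, and the classic filter(property, operator, value)
signature is assumed):

    import { Datastore } from '@google-cloud/datastore';

    const datastore = new Datastore();

    // Allowed: one inequality, with the first sort order on that property.
    const ok = datastore
      .createQuery('Message')
      .filter('created', '>', new Date('2017-01-01'))
      .order('created');

    // Not allowed: inequalities on two different properties, e.g.
    //   .filter('created', '>', someDate).filter('priority', '<', 3)
    // Datastore rejects the query instead of falling back to a slow scan.

    datastore.runQuery(ok).then(([messages]) => console.log(messages.length));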

------
astral303
What was the reasoning behind not using sharding and instead homing each
customer to a replica set?

How did you deal with customers growing large or spiking in usage? How did you
manage such hot spots?

Sharding is one of the key benefits of MongoDB (and frankly most NoSQL
solutions). Of course, you have to pick a good shard key.

One reason I can think of is that, IMO, sharding on MongoDB before version
2.4 had too many issues to be production-reliable. If Parse started with
MongoDB 2.2 or earlier, then I can see how they would avoid sharding.

------
anilgulecha
Interesting new take on BaaS with Baqend. Looks like Gomix?

~~~
DivineTraube
There are definitely some similarities. Like in Gomix, the idea of Baqend is
to let developers easily build websites, bots and APIs. However, Baqend has a
stronger focus on how data is stored and queried, while Gomix is more focused
on a rapid prototyping experience inside the browser.

In my opinion, the whole BaaS movement is about making things as smooth as
possible for developers and shortening the time-to-market to its minimum.
What some providers like Parse lost along the way are the capabilities for
small prototypes to grow into large systems. That requires being scalable at
the level of API servers, user-submitted backend code and the database
system. And at some point, it also requires letting the user tune things like
indices, shard keys and execution timeouts. This is the route we took at
Baqend. We do not want to be the simplest solution out there, but we aim to
provide the fastest (we use new web caching tricks) and most scalable one (we
expose how the database is distributed).

------
csmajorfive
Can I suggest you edit your title? I thought someone on our team had written
this but, in this context, "lessons learned" is kind of misleading.

~~~
DivineTraube
Could you share what your lessons learned were? That would be a very
interesting perspective. Do you disagree with some of the points?

------
simscitizen
Clash of Kings used Parse, not Clash of Clans.

~~~
DivineTraube
You're right of course, I just changed that.

------
marknadal
If the database is the bottleneck, why are you using the same database
(MongoDB) that Parse (and Firebase) did?

The problems described in the article are quite literally my pitch deck, with
which I have successfully raised funding from billionaires Tim Draper and
Marc Benioff of Salesforce, for http://gun.js.org/ . So why did you decide to
stick with MongoDB when many other databases have tackled those problems?

~~~
mdekkers
    var gun = Gun('https://gunjs.herokuapp.com/gun');

    gun.put({hello: "world"}).key('random/WVNqluP4q');

and

 _Databases can be scary_

Who exactly are your target market?

~~~
marknadal
People who are fed up with databases not working. That is primarily
developers, but it also happens to include some government clients.

~~~
mdekkers
If you are in this business professionally as some form of engineer, thinking
that "databases are scary" is like your car mechanic thinking "engines are
scary".

