There's so much lacking in DocumentDB, as is evident from the feedback forum, that comparing it to Mongo is like comparing an infant to an adult. The infant might be cute, but it can't do a whole lot.
"As we were developing our new financial benchmarking service last year, we evaluated Microsoft’s Azure DocumentDB, but MongoDB offered much richer query and indexing functionality"
Total cloud-vendor lock-in. It's clear why the clouds want users investing in these difficult-to-migrate-from solutions...
> Third, we do it with love…
With love for our money, sure. What the hell does that even mean? I really rolled my eyes reading that blog post. This is childish and out of place for an article trying to sell security.
Because if you can do without it, why bother? Developing an access layer costs time and money. If you can leverage the DB features to do what you need, you can make your stack simpler and more maintainable.
Reasonably good security practices are not that much effort, and really it's a case for respecting your users for the most part.
The security trust game is starting to blow up. Yahoo just lost $250 million to it.
In this case, one can make the argument that a custom proxy layer, running in your DC (that proxies between the database and your actual frontend app) should not be necessary if the database offers sufficient per-connection ACLs and is secure.
That's a big if though.
Who is this for then?
The missing piece for the AWS serverless story is a database that is suitable for writing real world applications. DynamoDB is far from suitable for that task, which leaves AWS serverless with no good database.
Does serverless somehow mandate a non SQL solution?
On the other hand RDS hides a lot of the complexity from you. You don't have to pick an OS, apply updates, secure it, manage it, configure it, or patch it. There are some number of virtual servers out there that are nominally running your RDS cluster, but it's all pretty theoretical.
So I'm not entirely understanding your point.
> you need to pay to have an instance running per hour
You are paying to have instances running with every other DB service too; they may just break it out on your bill a bit differently. :)
The real issue with RDS for me isn't that they haven't removed the server part from the equation (they have), it's that they haven't removed the RDBMS from the equation. Schema changes, data migrations, replicas, sharding, scaling: All the hard parts of running a RDBMS are still there.
If Amazon could somehow make a magical service that accepted SQL queries and somehow returned my data, I'd be ecstatic - but the difference between that and RDS isn't the fact that they're letting me know how much ram the virtual server which is nominally running MySQL for me has.
What I'm getting at is, a hosted DB is a hosted DB. What makes SQL unsuitable for serverless?
Anyway, my bad, I now see your point :)
Touting "serverless" as some sort of mysticism that doesn't really mean anything useful doesn't really get anybody anywhere.
Thinking of setting up an EC2 instance running RethinkDB or PouchDB for my project (and for future projects).
What sort of database is effectively useless for querying?
Also, they need to ditch the really, really confusing and limiting scaling model. For a database that advertises scaling as one of its key strengths, DynamoDB sure has a bad scaling story.
Cassandra, Riak, Voldemort, HBase, Bigtable, Azure Table Storage, and many other implementations of wide column stores have similarly limited querying.
I'm also not sure what you mean by the limiting scaling model. I can go from 0 to 160k reads/second by turning a knob, and 160k is only the default limit (you can request higher limits).
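For reference, turning that knob is a single API call. A minimal sketch, assuming boto3 and a hypothetical table named "events"; the request parameters are built separately here so the shape is clear:

```python
# Build the arguments for DynamoDB's UpdateTable call, which changes
# provisioned throughput on a live table.
def scale_params(table_name, read_units, write_units):
    return {
        "TableName": table_name,
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    }

# With boto3 this would be passed straight through, e.g.:
#   boto3.client("dynamodb").update_table(**scale_params("events", 160000, 1000))
params = scale_params("events", 160000, 1000)
print(params["ProvisionedThroughput"]["ReadCapacityUnits"])
```

Note the default account limit mentioned above still applies; going past it requires a limit-increase request.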
It is not a document store. It's a wide column store. Use it for the right job and it does very well. Treat it like postgres and you are gonna have a hard time.
But yes, it's pricey. It may not be the best fit for some. Hopefully by the time you're taking 160k writes per second you have a solid business model. I mean, Twitter peaked at around 8000 tweets per second. What are you doing that requires 160k, and do you really need to be storing it?
For example, by changing my query strategy I was able to reduce the provisioned write units from 1900 to 150 (write units dominate the cost).
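To see why that matters, here's a rough cost sketch. The hourly rate is illustrative only (provisioned-capacity pricing varies by region and changes over time); the 1900 → 150 figures are from the comment above:

```python
# Rough monthly cost of provisioned DynamoDB write capacity.
# RATE_PER_WCU_HOUR is an example figure; check current regional pricing.
RATE_PER_WCU_HOUR = 0.00065  # USD per write capacity unit per hour
HOURS_PER_MONTH = 730

def monthly_write_cost(write_units):
    return write_units * RATE_PER_WCU_HOUR * HOURS_PER_MONTH

before = monthly_write_cost(1900)
after = monthly_write_cost(150)
print(f"before: ${before:.2f}/mo, after: ${after:.2f}/mo")
```

Whatever the actual rate, the bill scales linearly with provisioned units, so the query change cuts the write cost by the same 1900/150 ratio.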
Sure, you likely have more than one table on RDS, so that cost is amortized, but when you get to the scale where you need 160k reads/s, you aren't going to have much more than that one dataset in a single instance.
It feels tedious at first but once you develop some good habits and frameworks around denormalization it becomes easy to do that from day one.
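One common habit is the single-table pattern: denormalize related entities into one table and encode the access patterns in the keys. A minimal sketch; the key names and entities here are hypothetical, not from the comment:

```python
# Single-table denormalization sketch: an order and its line items share a
# partition key, so one Query on pk fetches the whole aggregate at once.
def order_item(order_id):
    return {"pk": f"ORDER#{order_id}", "sk": "META", "status": "open"}

def line_item(order_id, line_no, sku):
    # Zero-padded sort key keeps line items in order within the partition.
    return {"pk": f"ORDER#{order_id}", "sk": f"LINE#{line_no:04d}", "sku": sku}

items = [order_item("42"), line_item("42", 1, "sku-a"), line_item("42", 2, "sku-b")]
# All three items land in the same partition and come back in one Query.
assert all(i["pk"] == "ORDER#42" for i in items)
```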
This is not really the case. There are database systems that can handle large scale and complex queries, although usually at the price of reduced consistency guarantees.
The application is less flexible and requires making a lot of decisions up front, but operationally it's fantastic.
You also need to care about how DDB does its underlying partitioning. It would be nice to turn the knobs and be able to trust you will get X reads/sec and Y writes/sec, but that is only true per node! Unfortunately, DDB gives you zero information about how many nodes your DDB table is running on! (Yes you can guess pretty well if you keep track of your usage rate and do some math).
So when provisioning, you need to be aware that if you have 100 provisioned read ops, but you have data on 5 nodes, you really only have 20 reads/sec if one key gets hot.
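The arithmetic above, as a quick sketch (the 100/5 numbers are from the example; real partition counts are opaque, though they're driven by table size and provisioned throughput):

```python
# Provisioned throughput is divided evenly across partitions, so a single
# hot key only gets its partition's share, not the table-wide total.
def per_partition_capacity(provisioned_units, partitions):
    return provisioned_units / partitions

# 100 provisioned read units spread over 5 partitions:
hot_key_budget = per_partition_capacity(100, 5)
print(hot_key_budget)  # reads/sec available to one hot key
```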
I agree it's pretty easy operationally, but you can get burned if you don't know how it works under the hood.
But you're right, part of designing for DDB is picking a proper partition key so you don't end up with hot shards.
What is your view of services that provide the functionality of some other software or SaaS and are API/protocol compatible?
Can APIs/protocols be copyrighted or patented? I believe not, based on Google vs Oracle.
Also, there's a query playground if you want to try it out quickly: https://www.documentdb.com/sql/demo
As a SaaS, it's not surprising DocumentDB ships with security configured, and it also won't be surprising when people lose data because they set '123456' as their password or commit their password to a public repository.
Personally, I'm pretty happy with how easy it is to use the Azure Storage services (blob, tables, queues) as well as their Azure SQL offering. Far less arcane configuration options than you get with AWS's competing options. If only their compute nodes weren't so pricey.
In response to https://www.theregister.co.uk/2017/01/09/mongodb/ ?
"MongoDB databases are being decimated in soaring ransomware attacks that have seen the number of compromised systems more than double to 27,000 in a day."
I think that's a first, right?
MongoDB Atlas will manage MongoDB for you
Moving FOSS into the cloud as SaaS sounds kinda regressive to me...
I'm not saying this necessarily applies to compute engines or storage as a service or whatever, but something like gmail (SaaS) where your data is used to target ads at you could be considered exploiting your data. I would not put it beyond large companies to start considering doing the same on their storage-as-a-service offerings soon enough.
If Google, Microsoft, or other companies start to look at the data to exploit it, they will lose trust, customers, and data.
Then it will be easy to point to those terms in the Azure service.
> SaaS operators has also a history of selling data to others,
Then it will be easy to link to news about those SaaS operators selling the data.
> Many companies have restrictions against usage of SaaS for exactly those reasons...
Then it will be easy to bring examples of those companies with restrictions.
I have looked and I have not found a single case of Microsoft stealing Azure data, or any big SaaS company like Amazon or Google doing that.
> But I if you really want to find out (which I doubt
I'm not the one making unsupported claims without providing a single example; perhaps the one who doesn't want to find out that their claims are wrong is not me.
> I get the feeling you have some affiliation with MS?
No, I don't have any affiliation with Microsoft, and it is not one of my most loved companies. And I'm not the one accusing others of being shills with secret ties to companies. I'm also not the one making unsupported claims.
Also, Azure has the best suite of compliance certifications, which demonstrates their commitment.
BTW, have you read the terms and conditions before speculating on them?
Still waiting for any proof of your claim.
Trust in the belief that Microsoft will act in your best interest regarding the privacy of your data.
But, isn't reasonable then to ask if Microsoft is actually trustworthy? PRISM, NSAKEY, Flame malware propagating via Windows Update, their 0day policy... I don't think Microsoft is trustworthy.
If you worry about interception, then code inspection and monitoring isn't going to give you any assurances. You'd have to run open-source software locally, audit it, and not put it on a cloud like Azure in the first place?
(And the NSAKEY was something completely different if you dig into it.)
I am just saying, in this specific case, should you trust Microsoft with your data? That's all.
If you don't trust MS you can trust one of the organisations that certified them for the strictest compliance and regulations in the public cloud space.