Edit: I can confirm it does not allow the UTF-8 null character in strings: https://docs.aws.amazon.com/documentdb/latest/developerguide... ... It is written on top of PostgreSQL.
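If the null-character restriction matters to you, a pre-insert check is easy to sketch. This is a minimal Python sketch, not anything taken from the AWS docs; `find_null_strings` is a hypothetical helper name, and it only inspects string values (not keys) in plain dicts and lists.

```python
def find_null_strings(value, path=""):
    """Return the paths of all string values that contain U+0000.

    Hypothetical helper: walks plain dicts, lists and tuples only.
    """
    hits = []
    if isinstance(value, str):
        if "\x00" in value:
            hits.append(path or "<root>")
    elif isinstance(value, dict):
        for key, child in value.items():
            child_path = f"{path}.{key}" if path else str(key)
            hits.extend(find_null_strings(child, child_path))
    elif isinstance(value, (list, tuple)):
        for index, child in enumerate(value):
            hits.extend(find_null_strings(child, f"{path}[{index}]"))
    return hits


doc = {"name": "ok", "tags": ["a", "b\x00c"], "nested": {"note": "x\x00"}}
print(find_null_strings(doc))
```

Running it over a document before writing tells you which fields would need cleaning, without guessing how the server itself reports the error.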
I kinda expected them to build it on top of DynamoDb's backend and provide the same kind of "Serverless" on demand experience, but I guess the architecture didn't fit, or maybe this was just faster.
However, the fact that writes aren't horizontally scalable makes it a laughable NoSQL database. It probably ticks the box for enough of their enterprise customers to be a mild success, and they'll keep it on life support forever, like SimpleDB, until they implement a proper solution, assuming there is enough demand for one.
ElasticSearch on the other hand...
If only they had a competitor that could launch the same products a few months later but offered higher reliability off the bat, that could eventually force Amazon to improve their reliability or risk losing customers long term.
Being first to market doesn't ensure eventual market dominance. Sure, it could give you important feedback. But if your product is subpar, the feedback will have a ton of noise and possibly be useless. Plus it's not worth creating negative externalities and earning the reputation.
Reliability is the trickiest of the three because it requires the customer to architect their solution with multi-AZ support in mind, but AWS always provides the foundation for that architecture.
Could they, and should they provide more features and a better developer experience around building fault tolerant solutions? Absolutely! But I certainly don't think they have a bad reputation for reliability.
Doesn't Azure Cosmos DB do this? From https://docs.microsoft.com/en-us/azure/cosmos-db/introductio...
> You can elastically scale throughput and storage, and take advantage of fast, single-digit-millisecond data access using your favorite API among SQL, MongoDB, Cassandra, Tables, or Gremlin.
Haven't used it though, so would welcome some real world experience.
They have, it's Azure. I'm even a little bit scared because no one here is mentioning Cosmos DB... It seems to me that most of the community only knows AWS products.
Customers are paying AWS so that their SREs don't get called, they don't care if the AWS SREs do as long as the system keeps running.
Based on the supporting quotes at launch from Capital One, Dow Jones and WaPo, it sounds like enough customers are OK with vertical write scalability and (pretty awesome) horizontal read scalability for now because it fits their use case and is better than what they had before.
Also consider that since the cluster management overhead has been removed from the customer, they can essentially "shard" by using a separate cluster for each sufficiently large service/org/dept, which might actually work out better for them in some respects.
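To make the "separate cluster per service" idea concrete, here is a minimal Python sketch of the routing it implies. The service names and hostnames are made up for illustration, and `cluster_for` is a hypothetical helper, not an AWS API.

```python
# Hypothetical endpoints: services that are "sufficiently large" get their
# own cluster, everything else shares one.
DEDICATED_CLUSTERS = {
    "billing": "billing-docdb.example.internal:27017",
    "search": "search-docdb.example.internal:27017",
}
SHARED_CLUSTER = "shared-docdb.example.internal:27017"


def cluster_for(service):
    """Pick the cluster endpoint that a given service should write to."""
    return DEDICATED_CLUSTERS.get(service, SHARED_CLUSTER)


print(cluster_for("billing"))
print(cluster_for("internal-wiki"))
```

Each cluster still has a single writer, so this only helps when the write load splits cleanly along service boundaries.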
Perfect is the enemy of good enough, the architecture might be laughable to you, but it is probably miles ahead of what the customer was using before.
And the nice thing about this hypothesis, you can test it by looking how successful DocumentDB will turn out to be. ~
I think it works, and AWS has yet to be brought down by this horizontal complexity. Quite an achievement, but it might not be a satisfying experience for the engineers who work there.
The downside is that a lot of their products lack polish which sucks. On the flip side even when they are launched with minimal features, they do tend to be reliable, durable and secure, which is important when it comes to data related services.
I wonder how widespread this view is. I suspect it's more widespread than Amazon realise. They may have optimised into a local maximum where they get a lot of value from being first to market, but could potentially get more by being first to "viable to trust a business on".
As far as being "viable to trust a business on" the numbers don't lie, AWS is number one because customers are running their businesses on AWS. The fact that DocumentDB launched with supporting quotes from Capital One, Dow Jones and WaPo shows that customers were clamoring to use it even before GA.
Remember, a lot of these customers are coming to AWS because they tried doing it themselves and struggled. When it comes to data, customers trust AWS more than they trust themselves, and rightly so.
AWS also has not had a reputation for deprecating services it launches. I find very little risk in taking a dependency on something AWS releases.
They already are viable and trusted by multiple billion-dollar companies and governments.
This focus on actually meeting needs today is what keeps AWS on top while the others take 2 years to launch minor service upgrades.
The Aurora storage subsystem is much more limited in terms of horizontal scalability and performance, they probably chose it because it was a better/quicker fit.
There was work underway at the time I left to replace InnoDB with WiredTiger. It seemed to be very slow going, and I suspect WiredTiger being acquired by 10gen had a part in it. They also had only 1-2 engineers on the project of ripping out MySQL and replacing it, in a long-lived branch that constantly dealt with merge conflicts from more active feature development happening on mainline.
Aurora, simply by virtue of being newer and learning from DDB's mistakes (in the same way DDB learned from SimpleDB and the original Dynamo) probably has better extension points for supporting (MySQL, Postgres, Mongo) in a sane way.
Then again, the relationship between AWS and Oracle is even more contentious and Aurora MySQL is one of AWS's most popular products so I don't think they are terribly worried about building on competitor's technologies.
At least when I was there, the strong focus was always on adding new features (global & local secondary indexes, change streams, cross-region replication, and so on) to keep up with the Joneses (MongoDB et al).
Meanwhile, a bunch of internal Amazon teams were taking a dependency on it instead of being their own DBAs, and those teams didn't care that much about the whiz-bang features, they just wanted a reliable scale-out datastore that someone else would get paged about when some component failed.
Adding features at a breakneck pace while keeping up umpteen-nines reliability and handful-of-milliseconds performance meant tech debt and non-user-facing improvements, including WiredTiger, all got sidelined. Around the time I left, our page load was around 200 per week. That's one page every 50 minutes, 24/7, if you're keeping score at home.
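For anyone checking the page-load arithmetic, a quick Python sketch (the function name is just for illustration):

```python
def minutes_between_pages(pages_per_week):
    """Average gap between pages, assuming they are spread evenly, 24/7."""
    minutes_per_week = 7 * 24 * 60  # 10080
    return minutes_per_week / pages_per_week


print(round(minutes_between_pages(200), 1))  # 50.4
```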
I would love to get a behind the scenes look at the process of gradually improving the components of DynamoDB with better technologies, while still maintaining reliability and performance.
Apparently, they are using a 1:1 mapping between a collection and a table. Either by flattening the document or by using jsonb or equivalent. I'm not a big believer this is good for performance reasons, at least compared to a more normalized approach like the one we did for https://www.torodb.com But they may change it in the future --if they don't expose the SQL API to their internal representation.
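To illustrate the two options mentioned above, here is a minimal Python sketch of turning one document into one table row, either by flattening nested fields into columns or by keeping the body as a single JSON value (roughly what a jsonb column would hold). Both function names are hypothetical, and nothing here is confirmed to be what DocumentDB does internally.

```python
import json


def flatten(doc, prefix=""):
    """Flatten nested dicts into one level of column-like keys."""
    row = {}
    for key, value in doc.items():
        column = f"{prefix}_{key}" if prefix else key
        if isinstance(value, dict):
            row.update(flatten(value, column))
        else:
            row[column] = value
    return row


def as_json_row(doc, id_field="_id"):
    """Keep the id as its own column and store the rest as one JSON string."""
    body = {k: v for k, v in doc.items() if k != id_field}
    return {"id": doc[id_field], "body": json.dumps(body, sort_keys=True)}


doc = {"_id": 1, "name": "a", "address": {"city": "X", "zip": "1"}}
print(flatten(doc))
print(as_json_row(doc))
```

Flattening gives the planner real columns to work with but breaks down on arrays and on documents with differing shapes; the single-JSON-column form accepts anything but pushes all the structure into the JSON operators.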
I led a C# project where we could seamlessly switch back and forth between Mongo and SQL Server without changing the underlying LINQ expressions.
We sent the expressions to the Mongo driver and they got translated to a MongoQuery; we sent the same expressions to Entity Framework and they got translated to SQL for SQL Server.
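The general idea (one backend-neutral expression, several translators) can be sketched without LINQ. Below is a minimal Python analogue, not the C# project's code: `Cmp`, `And`, `to_mongo` and `to_sql` are hypothetical names, only three comparison operators are handled, and field names are assumed to be trusted identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cmp:
    field: str
    op: str  # one of "==", ">", "<"
    value: object


@dataclass(frozen=True)
class And:
    left: object
    right: object


_MONGO_OPS = {">": "$gt", "<": "$lt"}


def to_mongo(expr):
    """Translate the expression tree into a MongoDB-style filter document."""
    if isinstance(expr, And):
        return {"$and": [to_mongo(expr.left), to_mongo(expr.right)]}
    if expr.op == "==":
        return {expr.field: expr.value}
    return {expr.field: {_MONGO_OPS[expr.op]: expr.value}}


def to_sql(expr):
    """Translate the same tree into (where_clause, params) with ? placeholders."""
    if isinstance(expr, And):
        left_sql, left_params = to_sql(expr.left)
        right_sql, right_params = to_sql(expr.right)
        return f"({left_sql} AND {right_sql})", left_params + right_params
    sql_op = "=" if expr.op == "==" else expr.op
    # Field names are interpolated directly, so they must not be user input.
    return f"{expr.field} {sql_op} ?", [expr.value]


query = And(Cmp("age", ">", 30), Cmp("city", "==", "Oslo"))
print(to_mongo(query))
print(to_sql(query))
```

The calling code builds `query` once and never needs to know which store is behind it, which is the property described above.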
I’ve seen a LINQ to REST API provider.
I am however really hoping Amazon provides a MySQL 8.0 compatible version of Aurora with full support for its new hybrid SQL and Document Store interfaces courtesy of the X DevAPI and lightweight "serverless" friendly connections courtesy of the new X Protocol.
That way you don't have to choose just one approach, and you can have your data in one place with high reliability and durability.
My ultimate pipe dream would be that they also provided a redis compatible key/value interface that allows you to fetch simple values directly from the underlying innodb storage engine without going thru the SQL layer, similar to how the memcached plugin currently works
X DevAPI and X Protocol/X Plugin could team up and map K/V style access to the server internal InnoDB API instead of using a SQL service as it is currently done. They could try to do it "transparently" or let you set hints. Whatever is desired from an application standpoint.
Maybe not (but OP makes a lot of good points for why it is), but it is still based on the Aurora limits: 64TB of size, 15 low-latency read replicas in minutes, and presumably a single writer, which makes it a laughable NoSQL system since it cannot scale past one server's write capacity.
From the docs:
Changed in version 2.0: Version 2.0 of the MongoDB Connector for BI introduces a new architecture that replaces the previous PostgreSQL foreign data wrapper with the new mongosqld.
> Adam has endured the challenges of the open core model, and is refreshingly frank about its economic and psychic tradeoffs. And if he doesn’t make it explicit, Adam’s fundamental optimism serves to remind us, too, that any perceived “danger” to open source is overblown: open source is going to endure, as no company is going to be able to repeal the economics of software. That said, as we collectively internalize that open source is not a business model on its own, we will likely see fewer VC-funded open source companies (though I’m honestly not sure that that’s a bad thing).
The code needed to run those servers is the secret sauce and a huge competitive advantage, but with open source software you're giving away the secret sauce and the business victory goes to the one with the most business friendly servers
(There are many dimensions to "business friendly", a big one of which is "it's easy for us to start using this additional service since we're already paying this company for other services")
> ...for those open source companies that still harbor magical beliefs, let me put this to you as directly as possible: cloud services providers are emphatically not going to license your proprietary software. I mean, you knew that, right?
MongoDB Inc cannot make Amazon pay commercial license fees. That is not a thing that will happen. They have a lever in front of them with two positions, one of which is "large cloud companies might use your software for free", and the other is "large cloud companies will not use your software at all". They didn't like the first option, so they gave the lever a yank, but they're not going to like the second option, and there is no third option.
The way out is not to try and build a business on the assumption that people who have no interest, requirement or reason to give you large amounts of money will inexplicably do so anyhow. :)
This thread already has people eyeing up DocumentDB's pricing and comparing it favourably to MongoDB's competing Atlas service, and it's almost unthinkable to suggest that Atlas can compete on price with Amazon. The way to win this game is not to play; the rules are not in your favour.
Was that even the goal? My impression of the licensing change was not that they expected Amazon to pay fees for offering a hosted MongoDB service. It was instead to lock Amazon out, and keep MongoDB Inc. as the only "cloud provider" of a hosted MongoDB service (perhaps still on top of AWS but with a separate management interface).
Oh absolutely. I don't think they really thought they could force Amazon to license MongoDB, but I do think they believed they could force Amazon to not offer something that competed directly with Atlas.
That hasn't worked out for them very well.
(Not that I think leaving the license alone would have worked out any better. To the best of my knowledge, the MySQL, Postgres, Redis, and Memcache projects have not particularly benefited from Amazon building RDS and Elasticache on top of them, and I see no reason to think Amazon would have contributed a bunch of great patches upstream for MongoDB either.)
Unlike MongoDB, it is a real volunteer-led open-source project, and the goal is to provide an excellent database to users rather than make money. Having easy-to-use cloud hosted versions available helps with attracting users, mindshare, and perhaps in the long run developers to the project itself. Having cloud hosted versions from big vendors means that it's easy to justify "we'll use PostgreSQL for this project" to management or clients.
On another note, anyone that doesn't think API design is a creative endeavor and worthy of protection probably has never made a great API before. It may be OK to accept that and also let other people use the API for free but I think ruling that it isn't is BS.
Like, “how many ways can you do a date api”, and then turn around to look at the original java Date api, the Calendar api, JodaTime and JSR310.
Maybe the fundamental properties of the universe aren't copyrightable/trademarkable/patentable, but what you CHOOSE to do with those - what API you design or what widget you build out of it certainly is.
So if ReactOS gets popular but doesn't support Windows 10 APIs, will it be harming the windows ecosystem? If popular implementations of a tool exist that don't chase other (official or not) implementations' features but still get lots of users, that probably means that the popular implementations provide other benefits.
> API design is a creative endeavor
I agree with that.
> and worthy of [legal] protection
But not that.
There are thousands of very successful and profitable software companies that make proprietary products and offer managed services, training, support, etc. It's a great business, but it's not going to offer 100x wild startup growth.
These companies would all do fine if they bootstrapped or took a small seed/loan instead of taking on 100s of millions.
Unfortunately, something that is good, and something people want to use, are not the same thing. People will use AWS's offering even if it is worse and harder to use, because it is bundled as part of AWS. That is a safe option (it can't be that bad if AWS has released it) and an easy one (no need to think about what to use, you are using AWS already).
Being a big provider of virtual machines puts them in a very strong position to sell loads of other stuff.
Spoiler: they do not
I've been using Atlas for over a year now and I don't have any complaints. It was super quick to set up and I've never had a single issue in terms of performance or availability.
What have your issues with Atlas been?
Realistically, the next step you will see, unless something changes, is that they will start going after people for API duplication. They have precedent (currently) on their side in the US.
None of the reasonable players will touch this, but you can be sure some VC backed "open source" player will be willing to touch this 3rd rail in exchange for a Series A.
IANAL, but since they already released the API as open-source under the Apache 2.0 license, this avenue is closed off to them.
Plenty of reasonable players will touch Oracle v. Google going forward. I’m as eager to debate the opinion as other counsel. But procedural history demonstrates directly, not theoretically, that it’s effective against tech giants.
In the matter of API Owner v. Google, if API Owner touches that “third rail”, Google gets the shock.
Successful open source does not require someone making money off developing it. It is successful when it is something that helps a profitable company but is not core to their business; then, they benefit from making it open source and having everyone contribute to its development and maintenance.
Or, you make money off support and consulting.
The key take away is, you aren't going to make money off selling licenses for open source. Which is good, I think.
I suppose Mongo could sell exceptions to cloud companies, the way other companies dual license libraries or frameworks. But even Mongo’s bread and butter paid deals aren’t primarily about alternative license rights for open code. They’re about closed add-on code and services, as you describe.
Dual licensing, on its own, is an old and plenty good model for funding development of open source code. I’ve heard wind of dual licensing deals done decades and decades ago, maybe even before GPLv2.
The question is whether giants will pay the cost of reimplementing entire stacks, core and shell. I don’t have the time myself, so I’ll have to wait on a report about how compatible AWS DocumentDB really is.
Given AWS history, I’d expect they’ll get most of the popular functionality, most of the way, but gotchas will abound, and they’ll never hit 100%. Switching cost of code won’t bottom out unless DocumentDB takes lead mindshare, which closed clones rarely manage.
Just at a cursory glance it certainly seems like only the apache 2.0 mongo api license would apply. But I guess mongo could try to force the sspl on amazon?
Then a combined Google + Amazon + Microsoft may finally be able to reverse the API Copyright insanity that is hovering ominously over the tech industry, and Oracle can continue to be a shining city upon a hill of shitty technologies you should never allow your business to adopt.
It took too long for the open-source community to figure out that the cloud providers are killing them, now it's too late. Well played, AWS.
How are service providers killing FOSS? That doesn't make sense. Permissive FOSS licensing allows anyone to use their software, regardless of how it's used, and that's how it should be.
That's how it's killing "FOSS". Extend and Extinguish. This is not a new playbook.
Enterprise isn’t gpl, but source is provided.
(This could have been easily answered with a google search, as you pointed out)
If it were so easy, you could have provided the citation yourself in that comment.
I think that it's too late considering AWS already did that to most of the industries but here is Hazelcast's take: https://www.linkedin.com/pulse/open-source-needs-protect-its...
"Together with optimizations like advanced query processing, connection pooling, and optimized recovery and rebuild, Amazon DocumentDB achieves twice the throughput of currently available MongoDB managed services."
But if you have a medium-sized data set (eg. 50+ GB), this is definitely competitively priced. More RAM, storage, compute than Mongo Atlas and Compose for less money.
Here's hoping they introduce cheaper options!
It's tiny if you're a massive company and it's massive if you're a tiny startup.
Heck, you can even just use grep over 50GB reading straight from disk. It's tiny.
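A back-of-the-envelope sketch of that claim, in Python. The 500 MB/s sequential read rate is an assumption picked for illustration, not a measurement; plug in whatever your disk actually sustains.

```python
def scan_seconds(size_gb, throughput_mb_per_s):
    """Time for one sequential pass over the data, ignoring CPU and caching."""
    return size_gb * 1000 / throughput_mb_per_s


# Assumed throughput: 500 MB/s sequential reads.
print(scan_seconds(50, 500))  # 100.0
```

At that assumed rate a full pass over 50GB is on the order of a couple of minutes.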
50GB isn't trivial, but it's utterly manageable.
Please elaborate on why you think 50GB is anything other than a small dataset that can fit in memory on any half-decent server, though.
A correct pricing strategy needs to be per request or per instance; AWS is charging for both.
The key point is illustrated by this quote from their main landing page: "storage and compute are decoupled, allowing each to scale independently".
This suggests it is built on top of the Aurora storage layer, or something similar, as other comments have suggested. This means there is a real cost per I/O operation because you aren't limited by the physical hardware of the compute instances, you get "free" storage nodes underneath that do much more than traditional storage and thus have to be built into the pricing structure.
It is definitely not going to be the cheapest possible solution for all use cases, but do the math before you reject it. If it does follow the Aurora pattern, then the number of I/O operations you are billed for will be a lot less than you may think because, to use another quote from their product page, "Amazon DocumentDB reduces database I/O by writing only database changes to the storage layer, avoiding slow, inefficient, and expensive data replication across network links". I think that quote is harder to understand without background as it sounds like market speak, but lines up very well with some of their in depth Aurora whitepapers, such as https://www.allthingsdistributed.com/files/p1041-verbitski.p... Again, I haven't seen evidence this is based on Aurora but the details they talk about line up really well.
Pricing strategy has little to do with customer happiness in aggregate. Every price will make some customers happy, and other customers feel gouged, because different customers extract different amounts of value from your product. The key to protect yourself from competition isn't to spend time worrying about how pricing affects your aggregate customer volume, but about whether your customers are happy. Maybe some customers are unhappy because they feel gouged. Maybe you could make them happier by reducing prices. But maybe, you're better off letting them go, if they represent a small minority of your users, and instead focus on what a majority of your users might appreciate more - better service, relevant features, etc. which make them happier.
[Edit] Amazon employee working in Physical Consumer (not AWS). Asking out of personal curiosity.
You too? I'm in AFT. I posted the original "whatever the customer is willing to pay" comment. Mostly just offhand and yeah there's a lot of nuance to it.
I don't mean that anyone should want to individually gouge each customer, but when running a business one should pick a price whereby the total long term profit is maximized.
Your pricing determines the number of customers. Your pricing also determines the profit on each customer. But if you choose your pricing strategy correctly, you should have some people who won't buy your product.
Do you have a better idea of what this is than they do?
Considering they already have launch customers actively using this product and there are several comments on this page saying pricing is better than MongoDB?
I think the idea is that by charging precisely where they incur costs, they can be much more reactive to different usage patterns, and therefore be more competitively priced overall.
Although it certainly does create lock-in due to not being able to figure out your billing and accurately model alternatives.
It's targeted at enterprises like mine who currently use MongoDB on premise and are looking for a managed solution. The advantage of AWS over Atlas is you can use the same security and governance approaches e.g. IAM policies, ADFS/SAML integration, Cloudwatch/Cloudtrail etc.
Also, I feel that they HAD to offer this to counter Azure Cosmos DB.
Azure tried to get fancy, with side sliding panels all over the place, and it is barely useable. The nicest thing I can say is it is "quirky." It isn't really productive however, particularly not on my 1080p monitor at Windows 10's default 125% DPI.
I literally quit Azure's Application Insights and went back to Google Analytics simply because I hated the Azure UI with a burning passion of a thousand suns.
The concept of writing queries is good, but if that's the only way you can get at your data you better make it damn easy, and they didn't. I'm sure for full time data pros it is a dream however.
Our Azure SA was giving us a presentation and actually got confused himself. "So that's TFS... I mean VSTS... Actually wait, it's Azure DevOps now?"
Recently I made a typo on a formal document. Wrote "AMI" when I meant to write "IAM". Oops.
In other words, DocumentDB is only a drop-in replacement for MongoDB if you weren't using any of the features Amazon decided not to support.
Happy to be corrected if I'm misreading the documentation!
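One way to check whether an application depends on the missing features is to scan its aggregation pipelines before migrating. A minimal Python sketch follows; the set of unsupported stages is an assumption taken from what other comments in this thread report ($lookup and $graphLookup at launch), not an authoritative list, and `unsupported_stages` is a hypothetical helper that only looks at top-level stages.

```python
# Assumption: stages reported in this thread as unsupported at launch.
# Check the current DocumentDB documentation before relying on this.
UNSUPPORTED_STAGES = {"$lookup", "$graphLookup"}


def unsupported_stages(pipeline):
    """Return the unsupported stage names used at the top level of a pipeline."""
    found = []
    for stage in pipeline:
        for name in stage:
            if name in UNSUPPORTED_STAGES and name not in found:
                found.append(name)
    return found


pipeline = [
    {"$match": {"status": "open"}},
    {"$lookup": {"from": "users", "localField": "user_id",
                 "foreignField": "_id", "as": "user"}},
    {"$group": {"_id": "$user_id", "count": {"$sum": 1}}},
]
print(unsupported_stages(pipeline))  # ['$lookup']
```

Stages nested inside other stages are not inspected, so an empty result is a hint rather than a guarantee.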
Having said that, when we were working on https://www.torodb.com we discussed how we'd implement the oplog. And actually, based on PostgreSQL's logical decoding (LD), it wouldn't have been a great deal (there are some gotchas, but LD brings much of what you need). So I won't be surprised if this gets implemented sooner rather than later.
Given that current Amazon leaders actually came from Microsoft's data platform group this leaves a bit of a bad taste behind.
I'm not working for either company.
See: Apple licensing the iOS name from Cisco before announcing the name change.
- "replicates six copies of your data across three AWS Availability Zones (AZs)" 
- "Amazon DocumentDB uses a distributed, fault-tolerant, self-healing storage system that auto-scales up to 64 TB per database cluster." 
- "When writing to storage, Amazon DocumentDB only persists a write-ahead logs, and does not need to write full buffer page syncs." 
If they wanted to twist the knife they should get to work implementing a pass through migration option.
- control of the client and particularly its exposed featureset
- due to that, also control of the protocol and the ability to, for example, insert legally protected strings in the style of the Apple SMC signature into the handshake.
- ability to gate new features on the presence of an object like a copyrighted text, trademark, or even a crypto signature
- ownership of the name. AWS are pissing in Mongo's pool marketing themselves as compatible, and there are a variety of ways it could be made to backfire, if it were in Mongo's interests to encourage that outcome
- AWS focuses on breadth and very rarely nails any particular service. Their hosted Postgres for example still does not expose core features years later
- Following from that, AWS services on the whole are rarely best-in-class in terms of raw performance. I imagine Mongo could continue to easily compete on benchmark results running on AWS own infrastructure
I think this is a really interesting case, far more interesting than the technical minutiae of Just Yet Another AWS service. It does not sit well with me whatsoever that they're basically ripping off a much smaller company's core tech while simultaneously borrowing their trademark (in a legally acceptable manner) as part of the marketing, but I also find it hard not to see a ton of potential upside from this for Mongo.
But I can't help but think Amazon can and would easily fix those things if they mattered. Amazon's hosted Elasticsearch is a lot cheaper than Elastic's, and I'll bet that's enough to get people to use it.
By poor performance, I assume you mean IO? AWS Elasticsearch has supported the i3 instance type (NVMe on-instance storage) for well over a year now. Additionally, you could enable slow logs to catch perf issues yourself.
> very slow to make cluster changes
Scale-out and access-policy changes happen in-place now and so happen much faster than they used to be.
> launch new clusters
In my experience, it depends on the cluster size, but usually, I see cluster being up in 20m. That's nice given that it sets up pretty much everything (spin up instances, apply access policies, run health checks, enable cloudwatch monitoring, snapshots, create route53 records, integrate with cognito, enc-at-rest via KMS, spin up load balancers, setup vpc resources etc) on my behalf.
I can also launch an Elasticsearch cluster myself in about 2 minutes via terraform, so 20 minutes is not super impressive.
That said I recognize Elasticsearch is actually quite a finicky beast to set up, and my setup only has to deal with the needs I have, and probably would be set up horribly for certain other people. I can see how a hosted system that has to deal with all the weird edge-cases of a few thousand customers would take longer to set things up.
From their SEC filing:
We have a history of losses and may not be able to achieve profitability or positive cash flows on a consistent basis. If we cannot achieve profitability or positive cash flows, our business, financial condition, and results of operations may suffer.
In reality long term profitability is the only metric that matters for a corporation
I know this is a blunt and harsh statement to make, but when you sell a service, you have zero native incentives to Open Source the way your system works. It just opens up Competition. This is not unique to AWS/Amazon. But their success gives them the power to have wide OSS damage.
This is, to me, the biggest reason why cloud portability should be something that every customer of a cloud service should have in their plans. Amazon as a company has shown no timidness in both "embracing, extending, extinguishing" their competition.
OSS literally built the internet and opened up the world-wide communication age; let's not be so short-sighted that we don't see the proliferation of cloud services (specifically one having so much dominance) for what it really is.
>And while they’re at it, it would be great if they could please stop making outlandish threats about the demise of open source
>Adam’s fundamental optimism serves to remind us, too, that any perceived “danger” to open source is overblown: open source is going to endure
>and in the end, open source will survive its midlife questioning just as people in midlife get through theirs: by returning to its core values and by finding rejuvenation in its communities
This point is well known, and is in pretty much every cloud provider's marketing material.
If I could re-engineer MongoDB so that a monkey could administer it, you'd recommend I still use the cloud model rather than sell binaries?
This move shows MongoDB’s approach to document databases is compelling. We’ve thought so for a long time.
A cloud-hosted, truly global and managed MongoDB, MongoDB Atlas, has existed for the last two and a half years and has been serving more and more satisfied users every day with some massive workloads.
MongoDB Atlas runs the full implementation of MongoDB in the cloud.
Many features of MongoDB are documented as not being implemented by DocumentDB: these include change streams and many aggregation operators, including $lookup and $graphLookup. But beyond that, well, let’s just say we’ve been staggered by how many tests DocumentDB has failed (no spoilers!).
The MongoDB API is not under an Apache license.
MongoDB drivers are still under the Apache license. The MongoDB server used to be licensed under AGPL and is now licensed under SSPL. The source code is open to all, as it has always been, at https://github.com/mongodb/mongo
DocumentDB is not cheaper than MongoDB Atlas. Preliminary estimates show this to only be the case with very large collections and very, very high read/write workloads.
There’ll be more next week over on the MongoDB blogs.
* 4 independently-developed competitive compilers (gcc, clang, msvc, icc)
* 4 independently-developed competitive operating systems (windows, macos, linux, and bsd --I'm grouping the BSDs as one since their source code has a common ancestor)
* 3 independently-developed competitive browser engines, soon-to-be 2 (edgehtml, gecko, webkit)
And it's been that way for a few decades now; doesn't look like anyone is interested in taking the resources to make another one of those.
Strange turn of events for MongoDB but I guess that's what happens when the interface is open and anyone can build a backend to it, especially a relatively simple document-store.
I get it, I love AWS, I just wish this was priced differently.
I do this with Azure SQL Server instances, so a single instance can host all the 'non-critical' environments (dev, test, QA, demo) - works great!
> Amazon DocumentDB implements the Apache 2.0 open source MongoDB 3.6 API by emulating the responses that a MongoDB client expects from a MongoDB server
Do you have any other info/links related to this?
I was surprised to learn this. When working in Lambda, you have to choose between a relational database & a responsive API. It seems inevitable that AWS will fix this soon, but apparently this is a significant architectural problem. As I understand it, RDS instances should (must?) be accessed from within a VPC, and anything inside of a VPC needs an IP address, so the Lambda function has to wait ~10 seconds on cold-start for the Elastic IP service.
The only workaround I've heard of is to set up a service, such as CloudWatch, to call your Lambda function every ~25 secs to keep it "warm", but this seems antithetical to the value proposition of serverless architecture in the first place.
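For reference, the keep-warm workaround usually looks something like the following minimal Python handler sketch. It assumes the pings come from a CloudWatch scheduled rule, whose events carry `"source": "aws.events"`; `is_warmup_event` and the response bodies are made up for illustration.

```python
def is_warmup_event(event):
    """Scheduled CloudWatch events identify themselves with source 'aws.events'."""
    return isinstance(event, dict) and event.get("source") == "aws.events"


def handler(event, context):
    # Return early on keep-warm pings so they never touch the database.
    if is_warmup_event(event):
        return {"warmed": True}
    # Placeholder for the real request handling (hypothetical response shape).
    return {"statusCode": 200, "body": "real work goes here"}


print(handler({"source": "aws.events", "detail-type": "Scheduled Event"}, None))
print(handler({"path": "/orders"}, None))
```

This only keeps one container warm per concurrent ping, so a burst of real traffic can still hit cold starts.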
Of course, you could "just" use DynamoDB, but IMO the query language is really limited, and I'm not sure I fully grasp the problem (why doesn't DynamoDB need to be accessed from within a VPC?)
This is due to the fact that DynamoDB's query API is a standard AWS API which means granular internal/external access can be provided through IAM mechanisms (ie: roles, temporary tokens, federation, etc.).
On the contrary, to access RDS, Redshift or DocumentDB you would use standard ODBC/JDBC/Mongo facilities, which do not rely on IAM mechanisms, leaving VPC/Security Groups as the only isolation option.
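To make the IAM side concrete, here is a rough sketch of the kind of policy document that grants read-only access to a single DynamoDB table. The region, account ID and table name are placeholders, and `read_only_table_policy` is just a helper invented for this example.

```python
import json


def read_only_table_policy(region, account_id, table_name):
    """Build an IAM policy document allowing reads on one DynamoDB table."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Only read operations; writes such as PutItem are not granted.
                "Action": ["dynamodb:GetItem", "dynamodb:Query"],
                "Resource": f"arn:aws:dynamodb:{region}:{account_id}:table/{table_name}",
            }
        ],
    }


# Placeholder values, not a real account or table.
policy = read_only_table_policy("us-east-1", "123456789012", "Orders")
print(json.dumps(policy, indent=2))
```

There is no equivalent per-operation document for a plain ODBC/JDBC/Mongo connection, which is why network-level isolation ends up doing that job for those services.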
AWS services don’t have that issue because they’re accessible from anywhere on the network, even through an internet gateway / internal NAT.
Services with "native" AWS APIs use IAM for granular access management.
Other services can only support access restrictions using the network so that means VPC/Security Groups.
TL;DR: RDS and this new DocumentDB are essentially AWS managing your database VMs in EC2. The advantage is drop-in compatibility with regular apps expecting to reach a local server, because it is local to your VPC and uses normal ports. You can make them publicly accessible via the VPC firewall, but that is less secure. DynamoDB was designed from the ground up as an HTTPS API and that's the only way it's accessed.
Note that this is also nothing new; Azure Cosmos DB has had this for a while.
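As an illustration of the "it's just an HTTPS API" point, this sketch builds (but does not send) a low-level DynamoDB `GetItem` request with the standard library. The table name and key are placeholders, and the request is deliberately unsigned: a real call also needs a Signature Version 4 `Authorization` header, which the SDKs add for you.

```python
import json
import urllib.request


def build_get_item_request(region, table, key):
    """Build an unsigned DynamoDB GetItem request; nothing is sent over the network."""
    body = json.dumps({"TableName": table, "Key": key}).encode("utf-8")
    return urllib.request.Request(
        url=f"https://dynamodb.{region}.amazonaws.com/",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/x-amz-json-1.0",
            # The operation is selected by this header, not by the URL path.
            "X-Amz-Target": "DynamoDB_20120810.GetItem",
        },
    )


# Placeholder table and key, for illustration only.
req = build_get_item_request("us-east-1", "Orders", {"order_id": {"S": "42"}})
print(req.get_method(), req.full_url)
```

Since every call is an ordinary signed HTTPS POST to a public endpoint, there is no database port to reach and so nothing that forces the caller into a VPC.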
Mongo started as a very developer-friendly data store with lots of overstated claims to being a database. While earlier versions of Mongo were wildly dangerous to use as a business critical database, it has since matured and is now quite good at being a developer-friendly document-centric database. In my experience Mongo truly is developer-friendly as long as you don't try to use it as a full-blown transactional database with lots of complex data shapes and indexes.
I would not trust ES with anything but text search on a document store, and I would not trust Mongo with anything resembling multi-document transactions. With that said, they are both good at specific, different things.
MySQL and Postgres have their own baggage that makes them pretty terrible in some aspects. IMHO a JSON-over-HTTP API should really be table stakes for a database to be considered developer-friendly nowadays. (But please don't butcher HTTP like ES did and then claim to have that.)
No no no no, again no.
We don't need yet another shitty query language bolted onto one of the most error prone and annoying to type serialization formats while transmitting data on top of a by default stateless protocol that makes no sense for a database.
I'm sick of it.
SQL. The same queries will work in 95% of cases on any SQL db. There is a robust driver in almost every language. There are great implementations for every use case: embedded, scalable, transactional. That's friendly.
There is nothing wrong with SQL, it's freaking awesome.
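A small sketch of the portability claim, using the embedded SQLite that ships with Python. The `orders` table and its rows are invented for the example. The SELECT sticks to plain standard SQL (GROUP BY, HAVING, ORDER BY), though the `?` placeholder style belongs to this particular driver and varies between drivers.

```python
import sqlite3

# In-memory embedded database; no server needed.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer VARCHAR(50), total DECIMAL(10, 2))"
)
conn.executemany(
    "INSERT INTO orders (id, customer, total) VALUES (?, ?, ?)",
    [(1, "alice", 30.0), (2, "bob", 12.5), (3, "alice", 7.5)],
)

# Customers who spent more than 20 in total, with their order counts.
rows = conn.execute(
    "SELECT customer, COUNT(*) AS n, SUM(total) AS spent "
    "FROM orders GROUP BY customer HAVING SUM(total) > 20 ORDER BY customer"
).fetchall()
print(rows)
```

The query text itself should run unchanged on most other SQL databases; it's the DDL types and the parameter syntax where the remaining few percent tends to show up.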
The "drivers in almost every language" suck. They all suck. I've never seen a SQL driver and wire protocol that was not awful in some way. The statefulness is part of what makes them awful. We have better ways to keep track of state now.
Have you seen the PostgreSQL wire protocol? I recently built a logical replication client driver for a project and found the protocol to be excellent. After looking at the documentation, I'm no longer limited to languages that have drivers for Pg, because I know how easy it'd be for me to just write one.
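To give a flavour of how little there is to it, here is a minimal sketch that frames two frontend messages from protocol version 3.0: the startup message and a simple query. It only builds the bytes; authentication, reading responses and the replication sub-protocol are all left out, and the helper names are my own.

```python
import struct

PROTOCOL_VERSION_3_0 = 3 << 16  # major version 3, minor version 0 (196608)


def startup_message(user, database):
    """Frame a StartupMessage: length, protocol version, then key/value C-strings."""
    params = b""
    for key, value in (("user", user), ("database", database)):
        params += key.encode() + b"\x00" + value.encode() + b"\x00"
    params += b"\x00"  # a lone zero byte terminates the parameter list
    body = struct.pack("!I", PROTOCOL_VERSION_3_0) + params
    # The length field counts itself (4 bytes) plus the body.
    return struct.pack("!I", len(body) + 4) + body


def simple_query(sql):
    """Frame a simple Query message: type byte 'Q', length, null-terminated SQL."""
    payload = sql.encode() + b"\x00"
    # The length covers itself and the payload, but not the type byte.
    return b"Q" + struct.pack("!I", len(payload) + 4) + payload


print(startup_message("alice", "app"))
print(simple_query("SELECT 1"))
```

Every message after startup follows the same type-byte-plus-length framing, which is most of why a driver is approachable to write from the documentation alone.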
Just because some SQL drivers and wire protocols are awful (looking at you, Oracle) doesn't mean one should run for the hills, let alone to JSON.
So, two protocols? Two standard protocols is rarely better than one.
> having such a standard protocol makes sense in the age of web apps
Only if the existing standard protocol cannot work with the "web", and we have plenty of history proving otherwise. Replacing the existing standard with the loose JSON would be, strictly, a downgrade; and unnecessary, because we already do interoperate JSON and SQL. See: PostgREST and the many REST & GraphQL frontends on PostgreSQL.
> particularly when NoSQL offerings that are perceived by the market as competitors (leaving aside whether they really are - perception matters more here) do that already
This is really more like going into a pigpen and wrestling with the pigs. A database should do the job of a database. Competing for perception in a market that cannot make sane decisions for itself is how we get MongoDB.
Mongo is a heck of a lot easier to configure and develop with, and works great as a general database with a rapidly changing schema.
Elasticsearch is great at solving specific problems like searching for items in a specific way, but it's got quite a learning curve and is pretty painful to host and configure compared to Mongo where you can have a prod instance going in seconds.