Azure Cosmos DB: Microsoft's Cloud-Born Globally Distributed Database (muratbuffalo.blogspot.com)
179 points by ingve 8 days ago | 126 comments

After trying CosmosDB I put it down pretty quickly. I can't say I'd recommend it to anyone.

Shockingly poor perf. Is CosmosDB really that bad, or did we have it misconfigured? We're not sure; the docs didn't help us understand.

Random failures when connecting. Random errors (or worse, no errors but unexpected results) when querying.

Largely undocumented.

Painful interop with non-azure proprietary offerings.

Almost zero developer community. Most of the coverage of CosmosDB isn't by developers, it's by various MS affiliated blogs such as this one, which are more advertisements than resources.

Unfortunately CosmosDB seems to fit the Azure MO of investing more into sales/marketing than into engineering/support. I don't know about the general atmosphere, but in the circles where I work, Azure and its various proprietary components are losing developer goodwill at a staggering pace.

To the brave, I recommend trying to make a copy of a CosmosDb collection.

The only solution that doesn't require spinning up a VM, reading all the data, and writing it back again is the Azure Data Factory [0]. Which, according to Azure's own benchmarks [1], manages to copy only 2 MB/s between two CosmosDb instances, each with 100,000 RUs ("Request Units"). That's more than half a day for a small 100GB collection.
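A quick back-of-the-envelope check on that half-day figure (assuming the ~2 MB/s copy throughput from the benchmark above):

```python
# Time to copy a 100 GB collection at the ~2 MB/s the Data Factory
# benchmark above reports between two CosmosDb instances.
size_mb = 100 * 1024        # 100 GB expressed in MB
throughput_mb_per_s = 2     # benchmark copy throughput
hours = size_mb / throughput_mb_per_s / 3600
print(round(hours, 1))      # 14.2 -- indeed more than half a day
```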

And in case you're wondering whether 100K RUs is a lot, consider the cost: $5,840/month for single-region writes, $11,680/month for multi-region writes [2].
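Those two monthly figures are consistent with the list price Azure published at the time, assumed here to be $0.008 per 100 RU/s per hour over a ~730-hour month (a follow-up comment below notes the multi-region multiplier is actually billed as N + 1):

```python
# Sanity check of the quoted monthly costs, assuming Azure's list price
# of $0.008 per 100 RU/s per hour at the time this was written.
RATE_PER_100RU_HOUR = 0.008
HOURS_PER_MONTH = 730

def monthly_cost(rus, write_multiplier=1):
    # write_multiplier: 1 for single-region writes; the comment above
    # quotes 2x for multi-region writes.
    return rus / 100 * RATE_PER_100RU_HOUR * HOURS_PER_MONTH * write_multiplier

print(monthly_cost(100_000))     # 5840.0  -> $5,840/month
print(monthly_cost(100_000, 2))  # 11680.0 -> $11,680/month
```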

Oh, and as a bonus - can you guess the official procedure for restoring a collection from a backup? [3]:

> If you have accidentally deleted or corrupted your data, you should contact Azure support within 8 hours so that the Azure Cosmos DB team can help you restore the data from the backups.

And yeah, it's 8 hours because they back up the database every 4 hours and keep only the last two backup copies [3].

[0]: https://docs.microsoft.com/en-us/azure/data-factory/introduc...

[1]: https://docs.microsoft.com/en-us/azure/data-factory/copy-act...

[2]: https://azure.microsoft.com/en-us/pricing/details/cosmos-db/

[3]: https://docs.microsoft.com/en-us/azure/cosmos-db/online-back...

I am sorry to have to argue, but multi-region writes are billed at N + 1, where N is the number of regions. So it would actually set you back $17,520 [0]. And yes, I know that even their own documentation and calculator conflict on this, but they do charge the higher number.

0. https://docs.microsoft.com/bg-bg/azure/cosmos-db/optimize-co...

Hi, I am from Azure Cosmos DB Engineering Team.

We are in the process of updating the Azure pricing calculator to provide various options (e.g., # of regions, single vs. multi-master, etc.). The updates to the calculator are coming shortly.

In the meantime, please feel free to ping us at AskCosmosDB@microsoft.com, if you need any help with cost estimates or have any questions.

Even better.

Why is using a VM not allowed? They have the Bulk Executor for large scale update/ingest: https://docs.microsoft.com/en-us/azure/cosmos-db/bulk-execut...

You can also use the change feed, which is exposed as an event-hub-style stream, and save it to another collection using Azure Functions.

I can say this exact thing about most Azure services. Nothing really works as documented and everything is more expensive than it should be. I always feel like I’m missing something.

The perfected enterprise sales force at Microsoft. They are good, really good. Here is a fun anecdote: there was a lawsuit alleging that Microsoft's business practices against Novell were anticompetitive. Microsoft was forced to sell SUSE Linux as part of the judgment. For the couple of quarters this was in play, Microsoft managed to sell more Linux than Novell.

It's hard to get behind cloud services that require you to send an email to support engineers to reboot your instance.

Wow, is that really the case? The last “cloud provider” I remember this to be the case was SoftLayer. Is Azure really that bad?

With AKS (at least in the early days, not beta or anything mind you) the Kubernetes coordinator node was managed by Microsoft; after some arbitrary number of days/weeks it would become unresponsive, and the recommended action was to open a ticket and wait for them to restart it.

Azure Kubernetes Service provides a free managed Kubernetes control plane. It's certainly possible that prior to general availability a support ticket had to be opened to restart it. The control plane's underlying VMs aren't exposed on purpose; that's similar to what other providers do with managed Kubernetes services.

Hopefully the unresponsiveness issue was addressed quickly after it was reported by you.

Believe it or not Azure is the #2 cloud provider (by revenue?).

Wha? I've built several (currently about 20) enterprise application integrations across our infrastructure using Logic Apps, Function Apps, Service Bus, Cosmos, etc. The cost of running these is less than our budget for office supplies. I'm literally paying pennies.

Proof: https://imgur.com/a/asem3d9

I've found that the "Azure Storage" services (blob, queues, table) are pretty decent (for what they are). If you don't have huge throughput or size needs that is. Different beasts, but definitely useful imho.

If I need really high load/throughput, I'd probably reach for Cassandra on DO for best bang for the buck. YMMV though, and not too big on Cassandra unless you really need it.

C'mon, this is an extremely exaggerated opinion. I have used Azure and AWS and I find Azure to be much easier and more user-friendly.

Maybe at very large scale AWS shines through, but I would choose Azure over AWS any day.

You're literally the mythical unicorn; I don't know a single person who has appreciated using Azure.

I'd personally trade my first born with no second thoughts to use AWS instead.

I work in the European public sector, I can name you a few reasons to pick Azure over AWS. None of them are technical though, so you certainly have a point.

Microsoft is just better at selling the whole package to enterprise where your primary function isn’t tech.

They were first to incorporate EU legislation, and are frankly still the only one of the three big cloud suppliers that can show us exactly where our data has been or passed through at any time.

Their support is also excellent. We pay Microsoft a lot of money, much more than the cost of Azure because of licensing, but with it comes support. If something breaks down, I can call Seattle directly, after which we receive hourly updates by phone until the matter is resolved.

With Google you get to fill in a form or chat with a bot. AWS is better, but again mostly for their big customers, and since we aren't buying Office 365 or licenses from them, we're just not big enough to qualify for the support we need.

Surely you exaggerate. Azure is great; I think AWS is garbage in comparison.

If it is documented at all. Even their .NET SDK blows

Hi, I am from the Cosmos DB engineering team. I'm sorry to hear you ran into issues with our .NET SDK. We're constantly updating our SDKs and welcome feedback on GitHub; please post the issues there and we will get back to you shortly. For the current generally available .NET SDK V2, the GitHub page is at [1]. The upcoming V3 version of the Cosmos DB .NET SDK is open sourced at [2]; please give it a try when you have a chance and tell us what you think.

[1] https://github.com/Azure/azure-cosmos-dotnet-v2

[2] https://github.com/Azure/azure-cosmos-dotnet-v3

Thanks for the feedback.

I can second this. We evaluated AWS, Azure, and GCP for a very large healthcare company and found that Azure was the worst cloud services provider for anything. This was in spite of a big push by leadership to encourage Azure; developers just hated it.

Between AWS and GCP, even though AWS came out on top, GCP wins in quite a few services.

Could you go more into this? How did you go about the evaluation and what metrics were you using? How did you capture the 'developers just hated it' in a way that leadership understood?

I'm not disputing the outcome, just feel like it would be useful for others in a similar situation.

Gonna second this. We are also in a situation where a few of the higher management were clearly bribed to push our company in the Azure direction, but the amount of problems we see may destroy our company.

What tips in AWS's favor, roughly speaking? My general feel is GCP is slightly cheaper but is missing some offerings, while AWS is a mature 1-stop shop that's well documented.

Well Azure comes with the biggest compliance/certification package - of course upper management will love it.

Good luck with GCP support...

Unfortunately, I could only second this opinion and add a few points of mine.

1. Many unexpected behaviours which you would not encounter in any other software. For example, when you're blocked by its firewall, it lets you connect on all 7 OSI layers, then kicks you out with "Command failed with error 13: 'Not Authenticated' on server xyz:10255". By the way, even the Azure Portal itself doesn't know how to handle this behaviour correctly, and when not included in the firewall whitelist it just blinks like crazy.

2. Many seemingly basic features commonly found in other database systems are for some reason missing here. It doesn't support users as such, just the issuing of resource tokens [1] (which you must implement yourself). Thus, most CosmosDB users just go straight with the master key, even in production.

3. It is not fully (nor even nearly fully) compatible with any of the advertised APIs. SQL doesn't support joins, and the MongoDB API misses quite a few features and adds a requirement for the shard key to be part of the query. That requirement is not supported by Spring Data, so you must write low-level queries for everything but selects (think field-by-field updates with null checks).

4. The tooling is horrible. As it is compatible with neither MongoDB nor traditional SQL, you need to include the "CosmosDB Emulator" in your build in order to have at least some predictability. The problem? It only runs on Docker for Windows, so every developer on the team and every CI/CD worker would need to keep a Windows VM running. As of April 2019.

5. It is CRAZY expensive and the pricing is not really transparent. In our case we had 20 MB of data, four 1000-RU collections, and multi-master writes in Western and Northern Europe. The monthly bill? 300 EUR, even though the Azure Portal was displaying 1/6 of that cost: apparently for multi-master you multiply by the number of regions + 1 (2 + 1 = 3), while for geo-redundancy only by the number of regions (2), hence the 6x higher cost. The result was that we were paying several times more to manage 20 MB of data than for all our remaining data, hundreds of GB.

Maybe if you really, really need the planetary scale, the master-master replication and the <10ms SLAs it would make sense and the hassle and cost would be worth it (though I seriously doubt it), but in any other case it is a massive mistake to go this route.

1. https://docs.microsoft.com/en-us/azure/cosmos-db/secure-acce...
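On point 2: the reason "just use the master key" is so tempting is that the REST API's documented master-key auth only needs the key and an HMAC, while a resource-token service is something you must stand up yourself. A minimal sketch of the documented master-key signature (the demo key below is made up, not a real account key):

```python
import base64
import hashlib
import hmac
import urllib.parse

def cosmos_auth_token(verb, resource_type, resource_link, date_utc, master_key):
    """Build the 'type=master&ver=1.0&sig=...' Authorization value that the
    Cosmos DB REST API documents for master-key auth."""
    key = base64.b64decode(master_key)
    payload = (f"{verb.lower()}\n{resource_type.lower()}\n"
               f"{resource_link}\n{date_utc.lower()}\n\n")
    sig = base64.b64encode(
        hmac.new(key, payload.encode("utf-8"), hashlib.sha256).digest()
    ).decode()
    return urllib.parse.quote(f"type=master&ver=1.0&sig={sig}", safe="")

# Hypothetical key for illustration only -- never a real account key.
demo_key = base64.b64encode(b"0123456789abcdef" * 4).decode()
token = cosmos_auth_token("GET", "dbs", "dbs/ToDoList",
                          "thu, 27 apr 2017 00:51:12 gmt", demo_key)
print(token[:34])  # type%3Dmaster%26ver%3D1.0%26sig%3D
```

The resource-token alternative [1] signs with a scoped token instead, but you must build and distribute those tokens from your own service, which is why many deployments never get past the master key.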

Hi, I am from Cosmos DB engineering team.

1. This occurs when, after the connection has been established, CRUD commands are sent before the authentication command. Note that some MongoDB drivers have a bug where, in rare cases, they do not authenticate the connection before sending CRUD commands. Could you please check and upgrade to the latest Mongoose/NodeJS-Native/NodeJS-Core driver, which has a rewrite of the client-side auth stack and fixes a lot of bugs? One such issue is described in https://jira.mongodb.org/browse/NODE-1798 which was fixed by https://github.com/mongodb-js/mongodb-core/commit/7ffb4bbe05...

2. Resource tokens provide fine grained access control but it does require you to implement a custom token management service. We are working on support for Azure AD integration and support for custom roles and policies. The feature is code complete and should be made available soon.

3. Cosmos DB’s SQL query language is supported on a non-relational data model - which maps cleanly to document/column-family/graph. It does not support joins since (unlike a relational model), a single Cosmos container can contain items belonging to any arbitrary schema.

4. Re: the API for MongoDB requiring a shard key, we are going to relax this restriction before 7/30/2019. As you noted, we do support an emulator for Windows. We are working on a Linux emulator, which should be released later this year.

5. The portal was displaying incorrect cost estimate in certain cases. We addressed it and the fix is being rolled out.

Thank you for the feedback!

> It is CRAZY expensive and the pricing is not really transparent. In our case we had 20 MB of data, four 1000-RU collections, and multi-master writes in Western and Northern Europe. The monthly bill? 300 EUR, even though the Azure Portal was displaying 1/6 of that cost: apparently for multi-master you multiply by the number of regions + 1 (2 + 1 = 3), while for geo-redundancy only by the number of regions (2), hence the 6x higher cost. The result was that we were paying several times more to manage 20 MB of data than for all our remaining data, hundreds of GB.

I get your point, and they should definitely fix either the cost calculation or add a disclaimer for multi-region pricing. But I want to (sincerely) ask how you could assume that multi-region would _not_ cost more than single-region.

You are perfectly correct, but I assumed that for 2 regions it would cost twice the amount, not 6 times.

Hi, I am from Cosmos DB engineering team.

Thank you for the feedback!

We have published a number of documents on pricing transparency and cost optimization for Cosmos DB that we want to make sure everyone is aware of:

[1] Pricing model - https://docs.microsoft.com/en-us/azure/cosmos-db/how-pricing...

[2] Optimize cost for your dev/test workloads - https://docs.microsoft.com/en-us/azure/cosmos-db/optimize-de...

[3] Total Cost of Ownership (TCO) - https://docs.microsoft.com/en-us/azure/cosmos-db/total-cost-...

[4] Understand your bill - https://docs.microsoft.com/en-us/azure/cosmos-db/understand-...

[5] Optimize provisioned throughput cost - https://docs.microsoft.com/en-us/azure/cosmos-db/optimize-co...

[6] Optimize query cost - https://docs.microsoft.com/en-us/azure/cosmos-db/optimize-co...

[7] Optimize storage cost - https://docs.microsoft.com/en-us/azure/cosmos-db/optimize-co...

[8] Optimize reads and writes cost - https://docs.microsoft.com/en-us/azure/cosmos-db/optimize-co...

[9] Optimize multi-regions cost - https://docs.microsoft.com/en-us/azure/cosmos-db/optimize-co...

[10] Optimize with reserved capacity - https://docs.microsoft.com/en-us/azure/cosmos-db/cosmos-db-r...

If you have any questions regarding pricing, please feel free to reach out to us directly at AskCosmosDB@microsoft.com (direct line to the product team).

We do really appreciate the feedback.

Thank you.

CosmosDB is a JSON-based document store, and it uses partitions to scale out your data. Document stores generally don't support relational joins. The SQL support is only an interface to your data; it doesn't add relational functionality.

The partition key is required so that CosmosDB knows which partition to query. You can actually enable broadcast queries, which ask every partition, if you don't want to include the partition key, but this will be slower and require more RUs.
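The trade-off can be sketched with a toy partition map (hypothetical hash routing, not CosmosDB's actual algorithm): a query carrying the partition key touches one partition, while a broadcast query fans out to all of them.

```python
import hashlib

# Toy model of partitioned queries -- illustrative only, not CosmosDB's
# real partitioning scheme.
NUM_PARTITIONS = 4
partitions = [[] for _ in range(NUM_PARTITIONS)]

def route(partition_key):
    # Deterministically map a partition key to one physical partition.
    digest = hashlib.md5(partition_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_PARTITIONS

def insert(doc):
    partitions[route(doc["pk"])].append(doc)

def point_query(pk):
    # Partition key supplied: only one partition is consulted.
    hits = [d for d in partitions[route(pk)] if d["pk"] == pk]
    return hits, 1

def broadcast_query(predicate):
    # No partition key: every partition is asked, costing extra RUs.
    hits = [d for part in partitions for d in part if predicate(d)]
    return hits, NUM_PARTITIONS

for i in range(100):
    insert({"pk": f"user{i}", "n": i})

print(point_query("user7")[1])                     # 1 partition consulted
print(broadcast_query(lambda d: d["n"] == 7)[1])   # 4 partitions consulted
```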

In reality it is a document store indeed, but it is advertised by MS themselves as "globally-distributed, multi-model database service". So...

Same experience here. Also found it to be hideously expensive while setting all dials to minimum for experimentation and exploration. There may be a way to configure it for an affordable trial run, but it didn't motivate me to stay and figure it out.

Hi, I am from Azure Cosmos DB Engineering Team.

For development (before going into production) with Azure Cosmos DB, we suggest a number of options (also documented here: https://docs.microsoft.com/en-us/azure/cosmos-db/optimize-de...):

[1] Cosmos DB emulator (Cosmos DB Local version) - https://docs.microsoft.com/en-us/azure/cosmos-db/local-emula.... The Azure Cosmos DB Emulator provides a local environment that emulates the Azure Cosmos DB service for development purposes. Using the Azure Cosmos DB Emulator, you can develop and test your application locally, without creating an Azure subscription or incurring any costs. When you're satisfied with how your application is working in the Azure Cosmos DB Emulator, you can switch to using an Azure Cosmos DB account in the cloud. Support for Cosmos DB emulator on Linux is coming.

[2] We provide the Try Cosmos DB for free experience - https://azure.microsoft.com/try/cosmosdb/. It gives you a time-limited, full-service experience with Azure Cosmos DB. It is designed for trying out Cosmos DB: doing a tutorial, a demo, a quick start, or a lab, without requiring an Azure account or a credit card.

[3] We provide a free tier (Cosmos DB is part of the free account) - https://azure.microsoft.com/free. In Azure, we support a free tier with 12-month access to all popular services. As part of the free tier, specifically for Cosmos DB, you get 5 GB of storage and 400 RUs.

Thanks so much!

>Shockingly poor perf

>Random failures

>Largely undocumented

>Painful interop with non-azure proprietary offerings

>Almost zero developer community

>Most of the coverage ... isn't by developers, it's by various MS affiliated blogs such as this one, which are more advertisements than resources.

Welcome to Azure!

That's sad to hear. I've had a lot of fun playing around with Azure's table/blob storage services. Never deployed a large production app with them, but the performance seemed decent (eg. when using table storage as a json store). I did look at Cosmos DB but decided not to give it a try when I saw the pricing.

I like their table/blob and queue storage, but maybe it's because it's relatively mature and the domain is well understood. Most of the comparable services from other providers are generic analogs.

I'm not sure I feel much different across Azure Cosmos and Aws Dynamo. It seems super-easy to get locked into a relatively expensive proprietary offering in each case.

I second this. Azure Table Storage is an amazing product, but it would be a more powerful one if they enabled indexing on all columns, along with automatic backup options. Both are only available on the CosmosDB Table API, which makes me think Azure wants to push me to CosmosDB for those features, since there is no roadmap for them in Azure Table Storage.

Azure Table storage is a key/value store similar to HBase, Cassandra, GCP Bigtable, AWS DynamoDB, and others. It's a very stable and scalable design, which is why everyone has some version of it.

There's no way to add indexes to this data model; it's all done by writing your data into another table using the key you want to query by, with the value pointing to the original table's row. Perhaps Azure could offer secondary indexes like AWS DynamoDB does, but it's probably too complex a feature to add at this point, which is why it took DynamoDB several years to get to working GSIs.
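The write-your-own-index pattern described above, with plain dicts standing in for the two tables (real Table storage keys are PartitionKey/RowKey pairs; the names here are made up):

```python
# Manual "secondary index": a second table keyed by the attribute you
# want to query, whose value points back to the primary table's key.
users = {}         # primary table: user_id -> record
email_index = {}   # index table:   email   -> user_id

def put_user(user_id, record):
    users[user_id] = record
    # The application, not the store, keeps the index in step with writes.
    email_index[record["email"]] = user_id

def get_by_email(email):
    user_id = email_index.get(email)
    return users.get(user_id)

put_user("u1", {"email": "ada@example.com", "name": "Ada"})
print(get_by_email("ada@example.com")["name"])  # Ada
```

The cost of this pattern is exactly what a GSI automates: every write path has to update the index table too, and a missed update leaves the index stale.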

Having Azure Table storage indexed into Azure Search, with an Azure Function as the indexer, delivers a pretty powerful alternative that isn't exactly batteries-included, but close.

These are two very different products at two very different price points.

10 TB of Table storage is $655; 10 TB of CosmosDb is $2,500.

Putting it down quickly was your finely tuned technical instincts saving you from disaster.

Hi, I am from Azure Cosmos DB Engineering Team.

Please feel free to send the details of your workload that suffered from poor performance to AskCosmosDB@microsoft.com (it is a direct way to reach the engineering team and does not cost anything), and we can help determine why and provide recommendations.

For best practices and performance optimizations, the following links will help:





Also, the following conceptual docs will help:




We are doing a full refresh of our docs (rewriting a lot of the older content and adding new technical content). We welcome feedback on how we can make the docs better and invite everyone to contribute via public PR/Github. If there are any topics that are missing in our docs or are not easily discoverable, please let us know. We will make sure to address it.

We are committed to building a great community. As a relatively young database, we have a ton of work ahead, but we are super committed to making sure we make developers successful.

I used it briefly internally at Microsoft in 2009 and had almost the same experience: super slow, takes a while to even start the query, non-standard query language. For internal users there were even more hurdles: you had to get approval and quota before you could use it at all, and the default answer to both requests was "no"; you really had to justify it. As an aside, at MS "no" was the answer to any request you made of Legal. Lots of wheels were reinvented from scratch, with lots of bugs, just because Legal would not approve the use of BSD- or MIT-licensed code.

Imagine my shock when I started at Google a year later and discovered Dremel (now BigQuery) where I could just run a query and get the results back in seconds over tens of terabytes of data, with no quotas or approvals.

I think having a system like Dremel is a huge competitive advantage, because you don't have to guess blindly about things, you can just query them and get results immediately. If Cosmos is still the best MS has internally, they need to roll their own Dremel asap. Lots of great people from SQL Server team work at Google now, but MS still has plenty of people who could do a competent petascale columnar DB with one hand tied behind their back.

That's impressive, considering that CosmosDB development only started in 2014! You are talking about Cosmos, which is still around but is now used as the basis of Azure Data Lake [0]. However, I think the closest MS has to Dremel is actually Kusto [1], which has a custom query language but works pretty well.

[0] https://azure.microsoft.com/en-ca/solutions/data-lake/

[1] https://docs.microsoft.com/en-us/azure/kusto/query/

You're probably right. However, having worked at MS for nearly a decade in the past, I'd be stunned if Cosmos did not serve, at least in part, as the foundation, and CosmosDB did not inherit a good chunk of the team.

Dremel now uses standard SQL and unlike Kusto it supports multiway joins and a bunch of other things expected of a more "general purpose" analytical DB.

Ultimately MS efforts will be stymied by the fact that their underlying storage story is nowhere near as good as Google's Colossus.

I won't disclose the details, but that's another thing that blew my mind: I could do linear reads (which is what columnar DBs do nearly all of the time) faster than I could process the data, from hundreds, or even thousands of workers at the same time.

The IO throughput there is truly immense, which is especially impressive given that all of the storage is remote.

Cosmos and CosmosDB have next-to-nothing in common and were not built by the same people.

Why would you be stunned? They are completely separate systems.

It's like saying BigQuery and BigTable share something because they both have "Big" in their name.

It sounds like you're describing the internal big data platform "Cosmos", something which has been made publicly-available as "Azure Data Lake Analytics" and which has no relationship to "CosmosDB" other than a similar name.

Sounds like a branding mishap, Microsoft created the same type of hard to understand mess with the Microsoft Dynamics branding.

Like he said, the public product is called Azure Data Lake, and internal product names aren't brands (that's kind of the whole point of them).

You're conflating CosmosDB and Cosmos (which has been publicly released as Azure Data lake Analytics and botched imo).

You also forgot to mention that it is crazy expensive if you have lots of collections

Hi, I am from Azure Cosmos DB Engineering Team.

@llcoolv has responded and the response is correct.

For scenarios where you create a lot of collections/tables, we recommend provisioning throughput on the database and sharing it among all the containers, to save on costs. You can change the throughput at the database level exactly as you would at the container level. It is the same API, supported via the CLI, SDKs, REST API, and the Portal.

The entry point for database-level throughput is 400 RUs ($24 per month), and you can add any number of collections/tables that share this throughput. You can scale up and down in increments of 100 RUs ($6/month).

Please see the following links for details:

[1] https://docs.microsoft.com/en-us/azure/cosmos-db/set-through...

[2] https://docs.microsoft.com/en-us/azure/cosmos-db/how-to-prov...

[3] https://docs.microsoft.com/en-us/azure/cosmos-db/how-to-prov...

Please see the sample here for moving data:

[4] https://github.com/Azure/azure-documentdb-dotnet/tree/master...
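Those entry-point numbers line up with the rate quoted elsewhere in the thread (assumed here to be $0.008 per 100 RU/s per hour, over a ~730-hour month):

```python
# Rough check of the quoted database-level throughput price points.
RATE_PER_100RU_HOUR = 0.008   # assumed list price per 100 RU/s per hour
HOURS_PER_MONTH = 730

entry_point = 400 / 100 * RATE_PER_100RU_HOUR * HOURS_PER_MONTH
increment = 100 / 100 * RATE_PER_100RU_HOUR * HOURS_PER_MONTH
print(round(entry_point, 2))  # 23.36 -- roughly the quoted $24/month
print(round(increment, 2))    # 5.84  -- roughly the quoted $6/month step
```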



I am aware that one can provision per database but, as another poster mentioned, you can't change that once a database is created, and you also need to specify something called partition keys, which can't be done using the MongoDB driver. If I recall correctly, you don't even have examples for that in the CosmosDB Python API.

I mean yeah - it's possible. But it is also a pain in the ass.

Hi, I am from Cosmos DB engineering team.

Yes, for shared throughput database, you will need to create a new database.

- You can create MongoDB API collections with shard keys defined. It is fully supported by the MongoDB driver.

- Support for MongoDB collections without a shard key specified is coming in May.

Thank you for the feedback on the docs and the missing code samples. We will add those in the docs.


Thanks. I wasn't aware that a shard key is mapped to a partition key.

I usually put multiple document types into a single collection and have a “type” field to get around this. The fields are indexed so this is efficient. To decide when to use a different collection I look at the scalability and other settings at the collection level, and group things that have similar needs together.

If you only have a few "type" options, that may be a poorly performing index unless it's used as a secondary to a query that narrows results more.

It actually supports throughput provisioning at the database level, shared across its containers. The only catch is that you'll have to recreate the db (creation is the only time you can enable it), and data migration is not exactly quick.

That's good to know. My team is currently using Neo4j, but may be forced to switch to either Neptune or Cosmos. I figured those were probably pretty similar, but this sounds like Cosmos might be a really bad idea.

I am a Cosmos DB graph engineer and would encourage you to take the service for a spin and form your own opinion. This thread unfortunately has more emotion than truth to it. We built the service with the best intentions in mind.

Feel free to reach out to us at askcosmosdb@microsoft.com any time with any questions.

Could you be a bit more specific on the what exactly is untrue?

That sounds really bad. I wonder how Azure gained market share with a core offering (a database) being so broken.

The core database is SQL Server and they also have PostgreSQL. This is what most people are using.

I should have been more precise: Cosmos DB is their core distributed database offering. Neither SQL Server nor PostgreSQL is a distributed database; both require applications to deal with the presence of multiple database nodes (sharding).

The absolute worst bug I have ever had to debug was using CosmosDB as table storage. Everything worked fine up until we hit about 1700 records in table storage; once we hit that point it just stopped returning any records. We eventually found out that if you query using a field that isn't indexed, instead of throwing an error or something sane like that, it acts like everything is fine and returns an empty set of records.
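The failure mode described above can be modelled in a few lines: a store that answers queries purely from its indexes and, for an un-indexed field, silently returns nothing instead of raising an error (a toy model, not CosmosDB internals):

```python
# Toy model of the bug: queries on un-indexed fields return an empty
# set rather than an error.
class IndexOnlyStore:
    def __init__(self, indexed_fields):
        self.indexed_fields = set(indexed_fields)
        self.docs = []

    def add(self, doc):
        self.docs.append(doc)

    def query(self, field, value):
        if field not in self.indexed_fields:
            return []  # the surprising part: no error, just no results
        return [d for d in self.docs if d.get(field) == value]

store = IndexOnlyStore(indexed_fields=["id"])
store.add({"id": 1, "status": "active"})
print(store.query("id", 1))             # [{'id': 1, 'status': 'active'}]
print(store.query("status", "active"))  # [] -- looks fine, returns nothing
```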

Woa, wtf.

Is that indicative of the general quality level of cosmos?

Sort of. It doesn't really sanity-check anything, ever. It's more of a blank slate and you have to build whatever rules you want into your software layer.

The default mode is to index every field, so you can't get into the OP's situation until you start trying to fine-tune it. He basically turned off the index and then tried to search by it. This is surprising for people who just bring their RDBMS assumptions in and try to wing it without reading any documentation whatsoever.

I don't know if something changed, but there were no default indexes back when this happened (a year and a half ago). I had done no performance tuning, just created the document collections and added items to them.

It sounds like you inadvertently changed the index structure or deleted it. There have always been default indexes, and Microsoft's tuning advice is always to add special case indexes to the wildcard index they provision. If you know your data model well and want to improve write performance (RUs), you can delete the wildcard index. That's how the indexing policy has worked since the product launched.


Everything is indexed by default, this has always been the case.

What documentation though? Cosmos' documentation is appalling... it is not like you can learn these things from it. With Cosmos you are always on your own, trial and error, to figure out how it behaves

This documentation: https://docs.microsoft.com/en-us/azure/cosmos-db/index-overv...

There are dozens of pages there. It may not be perfectly organized but saying you can't learn anything from it is silly. And regarding indexing, the 2nd sentence on the page says this: "By default, Azure Cosmos DB automatically indexes all items in your container without requiring schema or secondary indexes from developers."
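For reference, the default policy that sentence describes looks like this (the "/*" wildcard include path is what indexes every property; the _etag exclusion is the documented default):

```json
{
  "indexingMode": "consistent",
  "automatic": true,
  "includedPaths": [
    { "path": "/*" }
  ],
  "excludedPaths": [
    { "path": "/\"_etag\"/?" }
  ]
}
```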

Google Datastore [1] works the same way. I imagine it's because indexes can be configured per row. So for example if you have a boolean value and only ever query for it to be true then you might set the index on only if it's true. The reason you might want to use indexes scarcely is because they cost money. For Google Datastore they used to cost extra write operations and storage. Now they just cost storage.


[1] https://cloud.google.com/datastore/

That must be the famous MongoDB compatibility layer kicking in. /s

This post is interesting but one or two MS employees did reveal some key details about CosmosDB in the original announcement thread: https://news.ycombinator.com/item?id=14308814


> There are many significant differences in capabilities, and design approaches between other systems (CockroachDB and Spanner) and Cosmos DB. At a very high level differences are at two levels - the design of the database engine and the larger distributed system.

> The database engine design is inspired by

> 1. LLAMA: http://db.disi.unitn.eu/pages/VLDBProgram/pdf/research/p853-...

> 2. Bwtree: https://pdfs.semanticscholar.org/7655/9c6cc259c6ab5baf7bd19d...

> 3. Schema-agnostic indexing techniques: http://www.vldb.org/pvldb/vol8/p1668-shukla.pdf.

> Please note that these papers are significantly behind the current state of the implementation. The most crucial aspect that these papers dont cover is the integration of the database engine with the larger distributed system components of Cosmos DB including the resource governance, partition management, and the implementation of replication protocol / consistency models etc.

> Our goal is to publish all of the design specifications including TLA+ specs over time.




The TLA+ specs were eventually published [0], although I wonder how up to date they are with the current state of the DB, since they haven't been updated for a while.

[0] https://github.com/Azure/azure-cosmos-tla

Hi, I am from Azure Cosmos DB Engineering Team.

The TLA+ specs are fairly up to date.


We used it also. Ran into the issue that collection (?) names are limited to 64 chars (1), which wasn't documented anywhere, triggering obscure errors while everything looked fine from the Azure frontend.

Also even simple queries using an index (and a uniform document structure) cost a comparatively huge amount of RUs compared to a competitor.

Icing on the cake was the non-support for paging at that time, and a total fuck-up when we tried a restore from backup operation (which can only be done by calling MS support, no automated way possible). If anyone wants to know details I can provide those.

Needless to say we switched to another provider.

(1) the names were generated by our automated deployment step, something like {environment}_{product}_{collection-name}

Hi, I am from Cosmos DB engineering team.

For SQL API, the collection names have a limit of 255 bytes (we will update the documentation). For Mongo API, we honor the limit prescribed by the API for the combination of <database>.<collection> to be 120 bytes as explained here - https://docs.mongodb.com/manual/reference/limits/#Restrictio...

We continue to optimize our indexing layout – there are several key improvements being rolled out. Please share the details of your expensive query (askcosmosdb@microsoft.com) to help us investigate.

OFFSET/LIMIT (Skip/Take) is currently in private preview. It will be broadly available by 5/15/2019.

We are working on customer controlled PITR, which will be available later this year.

Thank you for the feedback!

Any reason you wouldn't use resource groups and db/collections for segregation instead of encoding into the collection name? I believe all of that hierarchy ends up in the resource URI if you really need it encoded there.

CosmosDB does have problems but the comments here seem to mostly be about not understanding the data model. It's a JSON-based document-store that distributes data across partitions.

It offers several interfaces (set at the database level) but that doesn't mean all the functionality is supported. It works well with MongoDB, Cassandra, and Table storage but you won't get relational SQL joins or fast graph search queries in Gremlin.

I agree that marketing and documentation are poor, which leads to these misunderstandings. Multi-model is never perfect and people forget that emulation has a cost in performance, price and functionality.

The problem is that their marketing and docs make it sound like it is the solution for everything.

Legacy Cassandra workload? Use CosmosDB

Graph Database? Use CosmosDB

Legacy MongoDB database? Use CosmosDB

SQL Server database? Use CosmosDB

When the only really valid use case is:

Document Database? Use CosmosDB

Yes, I said exactly that.

Multi-model is a hard problem and only really works with similar data models. It's also confusing since people conflate SQL and relational semantics when it's really just a query language.

It does work well enough if you have an OLTP document use-case like MongoDB, Cassandra, and JSON/SQL.

Disagree about "conflate". SQL is very geared towards relational semantics.

They may be independent in theory (a stretch), but any non-trivial SQL query (say a medium sized one with 50 lines and 10 tables) is meaningless without a relational DB underneath.

What Cosmos calls "SQL" is not SQL at all, it lacks almost everything, it just borrows a few keywords. If you can't even do inner joins and left joins and so on between different tables, it is NOT anything close to SQL. Joins are sort of the point of SQL.

CosmosDB still doesn’t support skip and take. And aggregate queries are incredibly inefficient. We’ve had to move a heap of data from CosmosDB to SQL Server; our reports went from 3 hours to 5 minutes after moving to SQL.

I may be taking your post at face value, but I suggest that if you are using skip, i.e. LINQ Skip(), you stop. SQL's performance using OFFSET is pretty terrible once you get into the tens of thousands of records. Use seek (keyset) pagination instead.

Here's some random google result that talks about the concept. There are plenty out there though.
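To make the seek idea concrete, here's a minimal sketch using SQLite in Python (schema and names are made up): instead of skipping N rows, the client remembers the last key it saw and filters on it.

```python
import sqlite3

# Sketch: OFFSET pagination vs seek (keyset) pagination.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany("INSERT INTO events (payload) VALUES (?)",
                 [(f"event {i}",) for i in range(10_000)])

PAGE = 50

def page_offset(page_no):
    # OFFSET walks past and discards every skipped row: cost grows with page_no.
    return conn.execute(
        "SELECT id, payload FROM events ORDER BY id LIMIT ? OFFSET ?",
        (PAGE, page_no * PAGE)).fetchall()

def page_seek(last_id):
    # Seek starts right after the last key seen: cost stays flat for any page.
    return conn.execute(
        "SELECT id, payload FROM events WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, PAGE)).fetchall()

first = page_seek(0)               # first page (ids start at 1)
second = page_seek(first[-1][0])   # resume after the last id seen
assert second == page_offset(1)    # same rows, without the linear skip
```

The catch, of course, is that seek only supports "next page" style navigation over a unique ordered key, not jumping to an arbitrary page number.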


How were you aggregating data? CosmosDB is a document store, not a relational database, and wasn't designed for large-scale aggregations.

There are ways to use the Javascript UDFs to aggregate data within a partition but across-partitions is not a good fit for this architecture.

Hi, I am from Azure Cosmos DB engineering team.

We continue to optimize our indexing layout – there are several key improvements being rolled out. Please share the details of your expensive query (askcosmosdb@microsoft.com) to help us investigate.

OFFSET/LIMIT (Skip/Take) is currently in private preview. It will be broadly available by 5/15/2019.

FTA: "say datacenter automatic failover component, in one part of the territory. Getting this component right may take months of your time. But it is OK. You are building a new street in one of the suburbs, and this adds up to the big picture."

Wait, what?! Driving metaphor aside... 'MONTHS of your time' to implement failover? People, this is WHY the cloud was invented, so we didn't have to spend months re-inventing. In AWS and even GCP, this takes a day or less if you know the (well documented easy to use) storage offerings. Seriously reconsider your selection criteria when you start saying things like this, because what I just heard is that my team just told me implementing failover is going to cost $60k. Guess how much easier that made my case to switch to another cloud-native offering? TCO over everything. Even Ballmer would agree with that - he made the same case for Windows, against Linux in the 90's.

He's talking about building the underlying failover mechanism for Cosmos DB. For a customer it's easy and automatic, but GCP, AWS, and Azure have to build it first.

Surely you're referring to Microsoft's team's time to implement failover to the product offering, not time every customer spends on implementation???

FTA: "if _you_ are an engineer working on a small thing, say datacenter automatic failover component" (emphasis added). Pretty sure he's talking about EVERY customer spending months to turn on fail-over.

The previous poster is correct, I think you are mistaken. It should read like this:

"if you are an engineer at Microsoft working on a small Azure component, say a datacenter automatic failover component"

At least this is how I read it, since it appears the OP works at MS.

But months?

To build a globally distributed database offering as complex as Cosmos DB with multiple SLA requirements? Can you build it faster?

I can't for the life of me think of a compelling reason why regional failover is something most companies would want in a db. It sounds great, but the reality is: the last time Azure had a regional failure, it also took down other regions. AWS and Google also have similar horror stories. On paper, regions are geographically isolated, but in all of these instances of failure there was some "super-region" with core global infra that every region relied on, and it went down (AWS and us-east-1, Azure and south central US).

You don't actually want single-provider multi-regional, what you want is multi-provider, like what Anthos is trying to do. But that's a harder sell to CIOs, and Azure checks a lot of boxes on paper despite being just horrible.

The other funny thing about Azure is how they champion the number of regions they have (54), but many of these regions only have one AZ (only 8 have more than one). So when they say that something is multi-region or does regional failover, it's like, "great, that's TABLE STAKES for getting HA on Azure". But with AWS, you have at least two AZs in every region, so it's not as big of a deal. GCP is the same way, but there's some language in their docs even AWS poked fun at during re:Invent last year where they say that AZs "often" have isolated power and networking. Not always, just often. So are they truly isolated?

This isn't about customers using the cloud. His perspective is as an engineer on the Azure team building these databases and other features for you to use.

He's "inventing" the cloud so that you don't have to reinvent, as you state.

Hi, I am from Azure Cosmos DB Engineering Team.

The article is referring to the implementation details of automatic failover mechanisms inside of Cosmos DB service.

For the customers, automatic failover is available as a turnkey capability. Customers do not need to spend any time to implement automatic failover.


Beware of the REST API of Cosmos. We fell into the trap of using it..

1) it is not covered by the latency SLA. This is buried pretty deep...

2) suddenly you get magical responses that are supposed to trigger the client to do something (e.g. SQL queries with ORDER BY require some algorithm client side). These things are not documented anywhere, you have to read the Node or Python client libs..

Using Cosmos mainly requires being on .NET and using the official driver that communicates over a closed, undocumented binary protocol. (Even the Java driver, which has the full SLAs and uses the binary protocol, was launched just weeks ago.)

IMO it would have been less misleading if MS had just removed the docs for their REST API. Or at least put up a big warning about it being an undocumented afterthought.
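For a sense of what that client-side ORDER BY handling involves, here's a hedged sketch (the data and document shapes are made up, not the actual Cosmos wire format): each partition returns its own results already sorted, and the client merges the sorted streams into one globally ordered result.

```python
import heapq

# Hypothetical per-partition result sets, each already sorted by "ts".
partition_results = [
    [{"id": 1, "ts": 3}, {"id": 4, "ts": 9}],   # partition A
    [{"id": 2, "ts": 1}, {"id": 5, "ts": 7}],   # partition B
    [{"id": 3, "ts": 2}, {"id": 6, "ts": 8}],   # partition C
]

# heapq.merge lazily interleaves sorted iterables into one sorted stream,
# which is roughly the k-way merge a driver must do for cross-partition ORDER BY.
merged = list(heapq.merge(*partition_results, key=lambda doc: doc["ts"]))
print([doc["id"] for doc in merged])  # -> [2, 3, 1, 5, 6, 4]
```

If the service only hands you raw per-partition pages (as the comment describes for the REST API), this merge logic becomes your problem, which is exactly why the official drivers matter.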

Hi, I am from Cosmos DB engineering team,

Cosmos DB offers Java, Javascript, Python, and .NET drivers. Just like with most databases, it is recommended for applications to use drivers to work with Cosmos DB.

Cosmos DB offers REST API to work with data, documented at [1]. The primary focus of the REST API is SDK developers for platforms where we do not provide drivers yet. Just like most databases, we recommend to use provided drivers where available. This REST API is not intended for broad consumption by the apps. Your point about being more explicit about scenario the REST API is intended for and supported for is well taken. We will improve the documentation.

[1] https://docs.microsoft.com/en-us/rest/api/cosmos-db/

Thanks for the feedback!

It costs an arm. It's not even Mongo 4; it has a subset of Mongo 3.6 features: not all aggregation queries are available. Would not recommend.

Hey CosmosDB team in the comments:

Building distributed databases at scale is hard af, few people know what it’s like to run hundreds of thousands of databases. Don’t be discouraged by random negative posts, remember you’re doing something most of the world can’t do.

Thanks for the kind words.

Appreciate it.

CosmosDB used to be called DocumentDB, which Amazon now uses as the name for their service too.

Do they guarantee that if a write completes in one region, it is then available to reads in another region?

Yes, CosmosDB lets you choose your preferred consistency level.


The performance impact of the different consistency levels is available here.


As you would expect, stronger consistency leads to lower throughput.

Hi, I am from the Azure Cosmos DB engineering team.

All five consistency levels (strong, bounded staleness, session, consistent prefix, and eventual) are available for your Cosmos account, regardless of the number of regions it is associated with.

To answer your specific question, yes, if you choose strong consistency, you will read the latest write in any of the regions associated with your Cosmos database.

If you choose bounded staleness consistency, you can configure the number of versions (k) or time (t) up to which you are willing to tolerate the staleness for your reads in return for the write to get ack’d immediately (after it is quorum committed in the local region).

The documents below describe the consistency levels and tradeoffs: https://docs.microsoft.com/en-us/azure/cosmos-db/consistency... https://docs.microsoft.com/en-us/azure/cosmos-db/consistency...


I do not think so. When I last tried it they only allowed for "bounded staleness" consistency across regions despite supporting strong consistency within a region.

As far as I know, apart from the global distribution, virtually any major database offers the same thing today.

Here's a tech overview of Azure Cosmos DB from last year: https://www.youtube.com/watch?v=V_C7DlKVofc

Should help with most of the questions here.

Based on the feedback here, it's hard to imagine anyone choosing Cosmos DB over Spanner or just CockroachDB. I'm not familiar with the AWS equivalent, but it seems like Azure isn't exactly setting the bar high.

Spanner and CockroachDB are distributed relational databases.

CosmosDB is a JSON-based document store. It offers interfaces that emulate the ability to use SQL/Cassandra SQL/Azure Table/Gremlin graphs but this doesn't necessarily add functionality like relational joins.

Really? You can’t figure out why someone might use a Microsoft-hosted db over one at Google, or one they’d have to run themselves?

Azure’s growth rate is amazing. There’s a ton of adoption there that doesn’t have access to Spanner, or maybe has a document-based data model, where Spanner / Cockroach doesn’t make any sense.

Comments here are rather negative, yet the service is growing very rapidly, so there is a disconnect. I would encourage you to try the service and develop your own perspective.

Sounds similar to CockroachDB, by ex-Google devs: https://www.cockroachlabs.com/

CockroachDB is a proper SQL database; Cosmos is just a document store that also has a SQL interface.

It's a document store which has a whole bunch of read/write interfaces, including Graph, Cassandra, "SQL" (Document), and Table.

And this is part of the problem. Only the Document interfaces work correctly and performantly.

Can you explain in some precision what you mean by that?

SQL = structured query language. It's just an interface to access data. All relational databases offer it but so do many other non-relational systems. This means using SQL to read/write data is completely separate from having relational functionality like joins.
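As a small illustration of that point, SQL can query non-relational, JSON-shaped data too. Here's a sketch using SQLite's JSON1 functions in Python (requires a SQLite build with JSON1, which modern builds include; data and names are made up):

```python
import sqlite3

# SQL as "just a query language": filtering JSON documents with SQL,
# no relational schema or joins involved.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (body TEXT)")
conn.executemany("INSERT INTO docs VALUES (?)", [
    ('{"name": "alice", "city": "Oslo"}',),
    ('{"name": "bob", "city": "Lima"}',),
    ('{"name": "cara", "city": "Oslo"}',),
])

rows = conn.execute(
    "SELECT json_extract(body, '$.name') FROM docs "
    "WHERE json_extract(body, '$.city') = 'Oslo'").fetchall()
print(rows)  # -> [('alice',), ('cara',)]
```

The query is SQL syntax end to end, yet there's no relational model underneath, which is roughly the position Cosmos's "SQL" API occupies.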

Sounds pretty precise to me already.

May I ask that we not use the term Document Store or Document <anything> for JSONish data as used by Mongo, and by extension AWS "DocumentDB" and Cosmos DB. The term supposedly emphasizes that data graphs are stored as they are posted and requested by simple webapps (as opposed to normalized relational data), but that still has nothing to do with documents as understood by most people. It just poisons the search space (and AWS calling their implementation DocumentDB doesn't help either).

This was a really interesting write up. Doing a sabbatical with a large distributed systems team sounds really cool.
