
Azure Data Lake Store: a hyperscale distributed file service for analytics - mpweiher
https://blog.acolyer.org/2017/07/04/azure-data-lake-store-a-hyperscale-distributed-file-service-for-big-data-analytics/
======
theatraine
I'm a data scientist at Microsoft and use the precursor to Azure Data Lake
(internally called Cosmos) daily. It's a remarkably powerful and well-designed
system.

~~~
jchanimal
Is the Cosmos you reference related to
[https://azure.microsoft.com/en-us/services/cosmos-db/](https://azure.microsoft.com/en-us/services/cosmos-db/)?

~~~
bjg
No... it's really confusing :)

------
sbarre
Serious question: is a "data lake" just a really really really scalable
filesystem?

I see it described in many different ways but ultimately that's what it sounds
like to me.

Obviously I'm sure there's massive effort involved under the hood in
presenting it as "just a filesystem" to the end-user (or end-process), but am
I missing anything else?

~~~
tomnipotent
A "data lake" just means you've extracted and consolidated data from external
systems and left it as-is - warts and all. Downstream data
warehouses/marts/cubes etc. would be built from the data lake.

~~~
sbarre
Yeah, I understand what it is conceptually; I guess I was more curious about
what it actually implements. Sorry I wasn't clear about that.

------
viblo
Could someone explain the differences between Azure Data Lake and Azure Blob
Storage?

