
Right, I don't think anyone is suggesting using this as an operational data store.

Use it for infrequent/read-heavy/analytical/training access patterns. Set your bucket to an infrequent-access storage class, partition the data, and build out a catalogue so you're doing as little listing/scanning/GETting as possible.
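
To make the partitioning point concrete, here's a rough sketch with boto3 (bucket name and key layout are made up): if keys are laid out as Hive-style partitions, a query for one day only has to list/GET a single prefix instead of crawling the whole bucket.

    import boto3

    # Hypothetical layout:
    #   s3://my-analytics-bucket/events/dt=2024-01-15/part-0000.parquet
    s3 = boto3.client("s3")

    # Prune to one partition: only this prefix gets listed and fetched
    resp = s3.list_objects_v2(
        Bucket="my-analytics-bucket",
        Prefix="events/dt=2024-01-15/",
    )
    keys = [obj["Key"] for obj in resp.get("Contents", [])]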

Use an operational database for operational interaction patterns. Unload/ELT historical/statistical/etc. data out to your data warehouse (so the analytical workloads aren't bringing down your operational database) as either native or external tables (or a hybrid of both). Cost and speed against this kind of data will beat most other options, mostly because it's columnar data with analytical workloads running against it.
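
For the unload step, something like this sketch with pandas/pyarrow (the connection string, table, and bucket are all made up): dump historical rows into date-partitioned Parquet on S3, so analytical scans hit cheap columnar files instead of the production database.

    import pandas as pd
    import pyarrow as pa
    import pyarrow.parquet as pq
    from sqlalchemy import create_engine

    engine = create_engine("postgresql://user:pass@prod-db/app")  # placeholder DSN
    df = pd.read_sql(
        "SELECT * FROM orders WHERE created_at < '2024-01-01'", engine
    )

    # Write columnar, date-partitioned files (assumes the query returns an
    # order_date column); external tables in the warehouse can then point
    # at s3://my-warehouse-bucket/orders_history/
    pq.write_to_dataset(
        pa.Table.from_pandas(df),
        root_path="s3://my-warehouse-bucket/orders_history",
        partition_cols=["order_date"],
    )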




> Right, I don't think anyone is suggesting using this as an operational data store.

Databricks, Snowflake, AWS, Azure, GCP, and numerous cloud-scale database providers are 100% suggesting people do precisely that (sometimes without users realizing it, e.g. Snowflake, which sits on object storage under the hood). It's either critical to their business models or at least some added juice when people pay AWS/GCP $5 per TB scanned. That's why these shit tools and this mentality keep showing up.


K, I see the crux of your contention. It seems to revolve around either using cloud services in general, or specifically using S3 (or any blob storage) as your data substrate. That's fair.

Out of curiosity, what are you suggesting one should use to store and access vast amounts of data in a cheap and efficient manner?

Regarding using something like Snowflake as your operational database, I'm not sure anyone would do that. Transactional workloads would grind to a halt, and that's called out all the time in their documentation. The only thing close to such a suggestion probably won't be seen until Snowflake releases their Hybrid (OLTP) tables.



