Ask HN: Do you use S3/R2 as a data lake?
2 points by greatNespresso on Dec 10, 2022 | 3 comments
Curious to learn HN's opinion: what is your number one use case? What kind of queries do you run?


I use it (and GCS) as staging storage. When we bring in data, it often lands in the object store before it gets stuffed into BQ, Redshift, or whatever Spark-alike we are using at the time.


Thanks a lot! If I may ask, how do you prepare the data to be ingested into BQ, Redshift, or the like?


Well - formatting, cleaning, transformation into the right schema, perhaps enrichment
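
To make those steps concrete, here is a minimal sketch of what such a preparation pass might look like. It is not the commenter's actual pipeline: the schema, the region lookup table, and the function name `prepare` are all hypothetical, and a real job would read from and write to the bucket rather than work on in-memory strings.

```python
import csv
import io
import json

# Hypothetical target schema: column name -> type used to cast the raw string.
SCHEMA = {"user_id": int, "country": str, "amount": float}

# Hypothetical lookup table used for the enrichment step.
REGION_BY_COUNTRY = {"FR": "EMEA", "US": "AMER", "JP": "APAC"}


def prepare(raw_csv, loaded_at):
    """Turn raw CSV text into newline-delimited JSON matching SCHEMA."""
    out = []
    for row in csv.DictReader(io.StringIO(raw_csv)):
        # Formatting: normalize header names and strip stray whitespace.
        row = {
            k.strip().lower(): (v or "").strip()
            for k, v in row.items()
            if k is not None
        }
        # Cleaning: drop rows where a required column is missing or empty.
        if not all(row.get(name) for name in SCHEMA):
            continue
        try:
            # Transformation: keep only schema columns, cast to their types.
            record = {name: cast(row[name]) for name, cast in SCHEMA.items()}
        except ValueError:
            # Cleaning: drop rows whose values cannot be parsed.
            continue
        record["country"] = record["country"].upper()
        # Enrichment: add a derived column and a load timestamp.
        record["region"] = REGION_BY_COUNTRY.get(record["country"], "UNKNOWN")
        record["loaded_at"] = loaded_at
        out.append(json.dumps(record, sort_keys=True))
    return "\n".join(out)
```

The output here is newline-delimited JSON, which is one of the formats both BigQuery load jobs and Redshift COPY can read; Parquet or CSV would be equally reasonable choices depending on the warehouse.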



