
There's a very common data/ETL pattern wherein raw (unencrypted) data is stored in S3 at the very beginning of the pipeline. Adding encryption at that stage adds another point of failure, which can grind your whole pipeline to a halt.

I've seen a pattern of: drop raw data into an S3 bucket that has a very restrictive access policy and a long retention period. Then, process that data asynchronously (encrypt, transform, filter, etc.) and drop it into a different bucket/prefix that the other consumers read from.
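
A rough sketch of what I mean, in Python with boto3. The bucket names, the KMS key alias, and the transform function are all made up for illustration; the point is just that writers touch only the raw bucket, and encryption happens later, out of their path:

    import boto3

    s3 = boto3.client("s3")

    # Hypothetical names; the raw bucket carries the restrictive access
    # policy and the long retention/lifecycle configuration.
    RAW_BUCKET = "my-raw-landing"
    PROCESSED_BUCKET = "my-processed-data"
    KMS_KEY_ID = "alias/my-etl-key"  # assumed key for the processed copy

    def transform(data: bytes) -> bytes:
        # Stand-in for the real encrypt/filter/transform logic.
        return data

    def land_raw(key: str, payload: bytes) -> None:
        # Writers only ever do a plain put into the raw bucket; nothing
        # in their hot path can fail because of encryption.
        s3.put_object(Bucket=RAW_BUCKET, Key=key, Body=payload)

    def process_object(key: str) -> None:
        # Async step: read the raw object, transform it, and write the
        # result to the consumer-facing bucket with SSE-KMS applied.
        raw = s3.get_object(Bucket=RAW_BUCKET, Key=key)["Body"].read()
        s3.put_object(
            Bucket=PROCESSED_BUCKET,
            Key=key,
            Body=transform(raw),
            ServerSideEncryption="aws:kms",
            SSEKMSKeyId=KMS_KEY_ID,
        )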

If any part of your ETL fails (encryption included), you can fix the bug and reprocess from the raw data without writers seeing any impact.
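
Continuing the sketch above, reprocessing after a fix is just a replay over whatever is still sitting in the raw bucket:

    def reprocess(prefix: str = "") -> None:
        # Replay everything under a prefix from the raw bucket once the
        # processing bug is fixed; the writers never notice.
        paginator = s3.get_paginator("list_objects_v2")
        for page in paginator.paginate(Bucket=RAW_BUCKET, Prefix=prefix):
            for obj in page.get("Contents", []):
                process_object(obj["Key"])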



