
Backblaze is cheap, but if you're uploading millions of files, beware -- there is no way to just nuke/empty a bucket with a click of a button. If you're not keeping filename references in an external database, you are left to sequentially scan and remove files in batches of 1000 in a single thread.

Support could not help, and it took me months to empty a bucket that way.




That doesn't really make any sense, Backblaze were not limiting you to a single thread - you were...


You do need access to an index/DB of all files in a bucket in order to delete them in parallel. Otherwise you're stuck paginating with the B2 API.


You need a DB of all of the dead entries that need to be deleted, and that’s a fine thing to have.

There are lots of problem spaces where deletion is expensive, so it is time-shifted to avoid peak system load. Some sort of reaper goes around tidying up as it can.

But I think by far my favorite variant is amortizing deletes across creates. Every call to create a new record pays the cost of deleting N records (if N are available). This keeps you from exhausting your resource, but also keeps read operations fast. And the average and minimum create time is more representative of the actual costs incurred.

Variants of this show up in real-time systems.
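
For anyone who wants to see the amortized-delete idea concretely, here is a minimal sketch. The class name, the `reap_per_create` parameter, and the in-memory dict standing in for the expensive backing store are all made up for illustration, and it assumes keys are never reused after deletion.

```python
from collections import deque


class AmortizedStore:
    """Toy store where every create also pays for up to N pending physical deletes."""

    def __init__(self, reap_per_create=2):
        self.storage = {}     # stands in for the expensive backing store (hypothetical)
        self.live = set()     # keys that readers are allowed to see
        self.dead = deque()   # keys logically deleted but not yet physically removed
        self.reap_per_create = reap_per_create

    def create(self, key, value):
        # Pay down the backlog first: at most reap_per_create physical deletes.
        for _ in range(min(self.reap_per_create, len(self.dead))):
            self.storage.pop(self.dead.popleft(), None)
        self.storage[key] = value
        self.live.add(key)

    def delete(self, key):
        # Logical delete is cheap: hide the key and queue the real work for later.
        if key in self.live:
            self.live.remove(key)
            self.dead.append(key)

    def get(self, key):
        # Reads only consult the live set, so they stay fast regardless of backlog.
        return self.storage[key] if key in self.live else None
```

As long as creates keep arriving, the backlog shrinks by up to N per create, so storage use stays bounded without a separate reaper process.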


My case was really simple. I was done with my ML pipeline and nuked the database, but pics in B2 remained with no quick way to get rid of them and/or to stop the recurring credit card charges.

IMO an "Empty" button should have been implemented by Backblaze.


Would this technique have been faster?

A single pass: paginating through all entries in the bucket without deletion, just to build up your index of files. And then using that index to delete objects in parallel.
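
A rough sketch of that two-phase approach, to make the question concrete. This is not the B2 client API: `list_page` and `delete_file` are hypothetical callables you would wrap around whatever listing and deletion calls your client exposes, and the `(names, next_cursor)` return shape is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def build_index(list_page, page_size=1000):
    """Walk the bucket once, collecting every file name without deleting anything.

    list_page(cursor, page_size) is a hypothetical callable that returns
    (names, next_cursor), with next_cursor set to None on the last page.
    """
    names, cursor = [], None
    while True:
        page, cursor = list_page(cursor, page_size)
        names.extend(page)
        if cursor is None:
            return names


def delete_in_parallel(names, delete_file, workers=16):
    """Fan the deletes out over a thread pool; returns how many calls completed."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() drains the iterator so an exception from delete_file surfaces here.
        return len(list(pool.map(delete_file, names)))
```

The listing pass is still sequential, since each page depends on the previous cursor, but it only reads. Whether the parallel delete phase is actually faster depends on the service's rate limits, which this sketch does not model.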


I believe S3 is the same way.


S3 has an "Empty bucket" button, unlike B2.


Disclaimer: I work at Backblaze.

> no way to empty a bucket.

Backblaze currently recommends you do this by writing a “Lifecycle rule” to hide/delete all files in the bucket, then let Backblaze empty the bucket for you on the server side in 24 hours: https://www.backblaze.com/b2/docs/lifecycle_rules.html
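
For illustration, a rule along those lines might look like the sketch below. The field names are the lifecycle-rule fields as I understand them from the B2 docs, and the 1-day values are an assumption about the shortest useful setting, so check both against the page linked above before relying on them.

```python
import json

# Sketch of an "empty the whole bucket" lifecycle rule (verify against the docs).
empty_bucket_rule = {
    "fileNamePrefix": "",            # empty prefix is meant to match every file
    "daysFromUploadingToHiding": 1,  # hide each file one day after upload
    "daysFromHidingToDeleting": 1,   # delete each file one day after it is hidden
}

# Lifecycle rules are supplied as a list on the bucket settings.
payload = json.dumps({"lifecycleRules": [empty_bucket_rule]}, indent=2)
print(payload)
```

The rule can be entered in the web UI's lifecycle settings for the bucket or sent through the API when updating the bucket.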



