I tried perkeep a while ago. While the ideas are cool, the implementation is meh:

I added a single 2.7 GB Ubuntu ISO - it took 5 minutes to ingest it (on a tmpfs!) and turned it into 45k(!) little chunks. Wtf is up with that? At this rate, indexing my multiple terabytes of data is going to take days, and I don't even want to think how much seek time it's going to need if I store its repo on a spinning HDD.
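(For the record, that works out to roughly 2.7 GB / 45,000 ≈ 60 KB per chunk on average.)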

> I added a single 2.7 GB Ubuntu ISO - it took 5 minutes to ingest it (on a tmpfs!) and turned it into 45k(!) little chunks. Wtf is up with that?

Ingest time scales linearly with file size because perkeep has to compute the blobref (a configurable hash) for every blob (the chunks you're referring to). Splitting files into blobs/chunks is necessary because a stated goal of the project is to have snapshots by default whenever files are modified. Doing snapshots/versioning without chunking would be very inefficient, since any change, however small, would force the whole file to be stored again.
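To make that concrete, here's a toy content-defined splitter in Go (perkeep happens to be written in Go, but this is a sketch of the general technique, not perkeep's actual splitter, hash choice, or chunk-size parameters). A rolling "gear" hash is updated byte by byte, a boundary is cut whenever its low 16 bits are zero, and each chunk is then named by its content hash, so identical chunks in two versions of a file get the same name:

    package main

    import (
        "bufio"
        "crypto/sha256"
        "fmt"
        "io"
        "math/rand"
        "os"
    )

    const (
        mask     = uint64(1)<<16 - 1 // boundary fires on ~1 in 64K bytes (assumed target)
        minChunk = 16 << 10          // never cut chunks smaller than 16 KB
        maxChunk = 256 << 10         // always cut by 256 KB
    )

    // gear holds one pseudo-random value per possible byte, used by the rolling hash.
    var gear [256]uint64

    func init() {
        rnd := rand.New(rand.NewSource(1))
        for i := range gear {
            gear[i] = rnd.Uint64()
        }
    }

    func main() {
        r := bufio.NewReader(os.Stdin)
        chunk := make([]byte, 0, maxChunk)
        var h uint64
        var count, total int

        cut := func() {
            ref := sha256.Sum256(chunk) // stand-in for the configurable blobref hash
            fmt.Printf("sha256-%x  %7d bytes\n", ref[:8], len(chunk))
            total += len(chunk)
            count++
            chunk = chunk[:0]
            h = 0
        }

        for {
            b, err := r.ReadByte()
            if err == io.EOF {
                break
            } else if err != nil {
                panic(err)
            }
            chunk = append(chunk, b)
            h = (h << 1) + gear[b] // rolling "gear" hash over the recent bytes

            // Cut at a content-defined boundary (low bits of the hash all zero)
            // or when the chunk hits the hard size cap.
            if (h&mask == 0 && len(chunk) >= minChunk) || len(chunk) >= maxChunk {
                cut()
            }
        }
        if len(chunk) > 0 {
            cut()
        }
        fmt.Printf("%d chunks, %d bytes total\n", count, total)
    }

Feed the same ISO through it twice, or a slightly modified copy, and most of the chunk names come out identical, so only the changed chunks need to be stored again. That's the trade-off: cheap versioning and dedup in exchange for lots of small blobs and a full hashing pass on ingest.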


Reading the docs, I don't get the impression that snapshotting/versioning is a major feature of perkeep. It matters more in the domain of backup software (e.g. restic/attic/borg), where you'd also want delete functionality alongside it to reclaim space.

But perkeep's focus, as I understand it, is more on managing an unstructured collection of immutable things (e.g. a photo archive) than on being a tool to back up your mutable filesystem. So I'm not sure chunking the sh*t out of my files was a good design decision; it really kills performance on large files, especially on spinning disks.



