
Ask HN: Are there any open-source projects for long-term personal archiving? - rjegundo
If you take a long-term view, web applications and platforms tend to play a temporary role in our lives but end up keeping many parts of us (photos, text, video, quantified-self data, etc.). These can easily get left behind and lost when those services arrive at the end of their incredible journey.

Even if I'm not thinking of deleting all my social accounts (right now...), I was wondering if there are any solutions that one could self-host as a personal archive for the long haul, in the same way our parents kept boxes of memories. I'm thinking this would probably be some sort of quite simple but reliable service (so it's easy to maintain in the long term) that we could connect to existing platforms to fetch our memories from them.

I think this is something I would like to have for myself, and I guess I'm not the only one. Do you know of any open-source projects or other solutions that fit this pattern?
======
mobitar
This isn't a full solution yet, but aims to be: Standard File [1].

Standard Notes [2], which aims for longevity above all else, is built on top
of Standard File.

[1] [https://standardfile.org](https://standardfile.org)

[2] [https://standardnotes.org](https://standardnotes.org)

------
simplehuman
Maybe cloudron.io fits the bill. Your apps and data are part of your history.

------
spoonie
git-annex may help you store and back up files.

------
type0
Camlistore

------
hackuser
This is a question I sometimes wonder about, but I haven't found an answer.
Libraries deal with this issue, sometimes called digital preservation. A data
dump is below; I've collected the info with the idea of looking into it later.

\----

RESOURCES

Here are a few resources I've come across and plan to look into in detail
later. Unfortunately, most seem to be written for experts in institutions,
who have different needs and resources than you and me:

* Library of Congress: [http://digitalpreservation.gov/](http://digitalpreservation.gov/)

* APARSEN (Europe): [http://www.alliancepermanentaccess.org/index.php/about-apars...](http://www.alliancepermanentaccess.org/index.php/about-aparsen/aparsen-deliverables/)

* British Library: [http://www.bl.uk/aboutus/stratpolprog/collectioncare/digital...](http://www.bl.uk/aboutus/stratpolprog/collectioncare/digitalpreservation/)

* OCLC, a major library (consortium? association?), should be a good resource.

* The Internet Archive might be a good resource.

* OAIS (Open Archival Information System) is a solution with at least some institutional users or interest, including NASA and OCLC.

* Article: "A balancing act: The ideal and the realistic in developing Dryad’s preservation policy" by Sara Mannheimer in First Monday. Dryad is a long-term repository for scientific data: [http://firstmonday.org/ojs/index.php/fm/article/view/5415/41...](http://firstmonday.org/ojs/index.php/fm/article/view/5415/4105)

* Article: "Your Personal Archiving Project: Where Do You Start?" in a Library of Congress blog: [https://blogs.loc.gov/thesignal/2016/05/how-to-begin-a-perso...](https://blogs.loc.gov/thesignal/2016/05/how-to-begin-a-personal-archiving-project/)

* Book: Moving Theory into Practice: Digital Imaging for Libraries and Archives (Research Libraries Group, 2000): Apparently a well-respected book; widely referenced in my brief searches. Outdated?

* HN discussion: [https://news.ycombinator.com/item?id=7842629](https://news.ycombinator.com/item?id=7842629)

\----

FORMATS

Let's start with: What formats are recommended for long-term preservation?
Some specs would be:

* 100% fidelity: But to what? The same Word document can appear differently on different current systems

* Compatibility with systems 50-100 years from now

* Metadata handling

* Organization: Relationships between different objects, e.g., photos from the same vacation, maintained

* Solution for dynamic or interactive data

\----

SYSTEMS

Some specs for the preservation system:

* Periodic fidelity and functionality check: Don't just turn on the system 25 years later and think it will work and the data is uncorrupted.

* Redundancy

* Compatibility with systems 100 years from now, likely including a reliable upgrade path.

* Migration path to new hardware as old hardware becomes unavailable

* Availability after the owner dies

* Confidentiality: Your whole life will be there, and consider that other people's personal information likely will also be in the archive
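
For the periodic fidelity check in particular, a rough sketch of one common approach (a checksum manifest, sometimes called fixity checking) might look like the following. This is only an illustration under my own assumptions: the file name `manifest.json`, the function names, and the choice of SHA-256 are all hypothetical and not taken from any of the projects or resources mentioned above.

```python
import hashlib
import json
from pathlib import Path

MANIFEST_NAME = "manifest.json"  # hypothetical name, stored at the archive root


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    manifest_path = root / MANIFEST_NAME
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file() and p != manifest_path  # skip the manifest itself
    }


def write_manifest(root: Path) -> None:
    """Record the current state of the archive."""
    text = json.dumps(build_manifest(root), indent=2, sort_keys=True)
    (root / MANIFEST_NAME).write_text(text)


def verify(root: Path) -> dict[str, list[str]]:
    """Compare the archive on disk against the recorded manifest."""
    recorded = json.loads((root / MANIFEST_NAME).read_text())
    current = build_manifest(root)
    return {
        "missing": sorted(set(recorded) - set(current)),
        "changed": sorted(
            name for name in recorded
            if name in current and recorded[name] != current[name]
        ),
        "new": sorted(set(current) - set(recorded)),
    }
```

The idea would be to call `write_manifest` whenever the archive is deliberately updated and to run `verify` on a schedule against every copy. A non-empty "changed" or "missing" list on one copy is the cue to restore from another, which is where redundancy comes in. It only detects corruption; it does not repair anything.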

~~~
Spooky23
Don't forget - curation.

Librarians don't just hoard stuff.

~~~
hackuser
Great point.

