Ask HN: How do you manage your “family data warehouse”?
127 points by coderatlarge on Sept 15, 2023 | 135 comments
By "Family Data Warehouse" I refer to all the info and documents that one tracks over the years, perhaps for one's self, but then also potentially for spouses, kids, parents, siblings, relatives, etc.

Some common categories:

(1) Financial: account statements (pdf, paper, etc), transaction CSV/ofx/etc, insurance policies, financial institution correspondence (confirmations, T&Cs, etc), tax forms, deeds/titles, loans, debts, contemporaneous records, paystubs, employment letters, receipts, etc

(2) Health: test results, diagnoses, medications used, doctor info/correspondence, vaccinations, hospital interactions, statements, bills, etc

(3) Product and services info: products purchased, maintenance info, warranties, replacement parts, recall notices, class action events, professionals used (CPAs, lawyers, contractors, plumbers, etc), etc

(4) Personal legal and other documents: wills, trusts, health directives, pre-nups, legal settlements, personal contracts, rental agreements, government IDs, immigration/naturalization docs, law enforcement interactions, etc

(5) Memories and other info of sentimental or other value: pictures, videos, cards, recordings, music, diaries, clippings, personal notes, etc

For many of these one might ask the usual questions:

A. do I have a reason/obligation to retain it? for example, tax authorities often require certain classes of record keeping. sometimes commercial entities make false claims in retrospect that you can only rebut with evidence (even major banks!). some documents can end up only being used by heirs at end-of-life (ex: basis information for assets like a home, which can require proof of improvements over the years, depreciation claimed if the home was ever rented, etc)

B. is there a downside to retaining it? clutter, hard/costly to move, identity/other theft risk, physical decay over time (ex: obsolete storage formats (VHS, mini-DV, Blu-ray)), etc

C. will I (or the appropriate person) be able to find it when I need it? physical paper vs electronic mgmt, indexing, filing, passwords, legacy services for transferring access.

D. can I actually retain it? backups, water damage, fire, theft, physical safe management, etc. some online services don't provide any/easy export of data, many have inconsistent retention policies (ex: utility companies)

Some of the categories above have specialized software and services dedicated to them (ex: photos services, quickbooks, etc) while some are more ad hoc (goog drive, dropbox, local HDD); even specialized services often fall short on key matters (quickbooks doesn't track statements afaik); many specialized systems have various platform limitations (no turbotax on linux, say); many vendors try to play various lock-in tricks; some vendors go out of business or discontinue products over the years; some institutions are very paper oriented so one perhaps scans or tries to limit paper delivery; some services place data in the cloud creating security exposure; etc etc




In Belgium, every citizen has access to a "digital vault" provided by the government, which is accessible using their ID card. Individuals can upload up to 1GB of documents into this vault. It's a well-designed system that ensures privacy during one's lifetime, but you can choose what happens with it after your passing.

https://www.izimi.be/en/


In Germany we have government entities that are unable/refusing to accept anything but Mail in forms or fax as an official communication channel :/

I’m jealous over countries that have such well designed systems in place to make things easy for their citizens


Same. I was unaware that Belgium or India have such systems and often wished for them to exist in Germany, but thought that it was impossible for privacy reasons in the EU. Now it's clear that it's just incompetence or lack of funds or vision in Germany.


„Datenschutz“ (data protection) is an extremely popular excuse for any failing digital project. In reality, it’s probably a mix of complete incompetence (see digital drivers license), a lack of funds (look at smaller cities/towns) and a lack of capable employees with IT backgrounds.

Looking at the situation here, it’s really damning that Germany not only has essentially no digital processes implemented, but the people in charge also seem to lack something as fundamental as a coherent vision of how it should work at some point. That the current coalition also decided to essentially cut funds for digitalization projects isn’t making the issue any better.


In Germany, the government entities also rarely, if ever, communicate with each other. I'm offered a citizenship and while I'm extremely grateful and even considering the fact that Germany is the country I love the most by a huge margin, I couldn't bring myself to start gathering the documents they need from various government offices. It's been many years, I wonder if I'll get on with it sometime in the future.


In Berlin it’s literally impossible to apply. They stopped accepting applications while waiting for the central processing location to open. You must force your application through with a lawyer. There is a backlog of 30,000 applications. They plan to process fewer than 20,000 in a year if all goes well. It will not go well according to the latest report. Oh and they are about to reduce the citizenship requirements, but have no plan for dealing with the extra workload. Everything is still paper based.

I decided not to become a German citizen. I want no liabilities to this country.


> I decided not to become a German citizen. I want no liabilities to this country.

Sad considering the desperate need for people working in a variety of specialized professions, but a completely understandable reaction. Unfortunately, I don't see a realistic future where this is any better.


I can’t even start to imagine the annoyance that people have to go through when they need something really important, like a citizenship or a visa.

I haven’t had to deal with those bureaucratic processes, but if I extrapolate from the awful experiences I had with something as simple as a new ID, I can fully understand why people hate this kind of organization.


I'm dealing with a bureaucracy which is literally asking me to essentially document every place I've lived and every place I've worked for the last 30+ years. Not only that, but they require paper documents with paper Apostilles affixed and everything notarized. Even if it were possible, it would cost many thousands of dollars in documents and travel fees. In practice, it's essentially impossible because some employers no longer exist, others don't provide paper documents, etc.


Is that for a citizenship application? I moved to Germany around 1.5 years ago and am quite excited about the prospect of relatively easy and fast citizenship, but the process you’re describing is insane. I won’t be able to get paper docs from many of my employers.


It’s a massive pain and people regularly leave the country due to unbearable bureaucratic delays.


In Denmark we have e-boks and mit.dk which offer much the same service.

On top of that, every official document, from birth certificates, loans, driver's licenses, health insurance, and everything else, is stored in government databases and can usually be accessed from there, provided your national single sign-on solution, MitID, works.


Similar service is available in India : see digilocker https://en.m.wikipedia.org/wiki/DigiLocker


> It's a well-designed system that ensures privacy during one's lifetime

So is a filing cabinet or safe, ideas that have worked well for 1000s of years. If you really need a digital backup, throw the documents on a $10 flash drive, maybe have two copies of that.

I guess I'm cynical because governments and institutions get hacked all the time.


I had/have filing cabinets. The problem is that over the years they grew into collections of multiple paper boxes with poor indexing and scattered contents (i.e., some docs in a series in one box, other docs in a series in another box). And then I moved again and again for various reasons. And then I ended up living abroad in places where it wasn't easy to bring all the boxes along. It's been a tricky balance :)


India has a similar offering called DigiLocker - https://en.wikipedia.org/wiki/DigiLocker


A legal and secure dead man's switch?


As an American, this is hilarious. The American people would riot in the street over something like this.


Some Americans would. Those of us who want the US government to do an excellent job of taking care of the people in the USA (not just citizens) might be inclined to support such a centralized system for storing important personal information. I'm more collectivist after witnessing how the USA handled/handles COVID-19. I wear masks not just to protect myself, but also other people far removed.


I assume you mean that many Americans would consider this overreach by the government.


Are you able to pay to increase your storage amount?


Interesting. TIL


Indeed. Government-provided digital services can be a good experience. There is a lot of initiative in the US to improve the sprawling and incoherent experience. As an example, on the IRS website you can access your personal information (returns, all submitted tax info from various sources) going back for several years. As far as I'm aware this is a newer service.


I have a messy drawer where I dump all of this stuff. It’s basically write-only, once I throw something in there I know I’ll never look for it again.

For digital stuff it’s mostly the same idea. If it’s email (which is almost always the case), I archive and forget about it. If it’s something else, I’ll throw it in my virtual “messy drawer” which is just a folder in Google Drive.

You are getting quite a few clever ideas in this thread, but I suspect the vast majority of people out there in the real world do more or less the same as I do (of course, I’m biased).

My “philosophy” (overstatement) is that 99.99% of the time I’ll never need it. If I ever do, it might be a little painful to dig around, but that doesn’t justify having a whole system in place. I would rather keep my mess.


I used to do a box like this. The "temporal ordering" helped narrow search. But I found that finding things in it was still so difficult, that it wasn't worthwhile every time it'd be useful. There was also trying to find one document, and having to do a second thorough pass when I didn't find it in the first pass, which could take over an hour of misery.

Today, I routinely download and save PDF/OFX/QFX for every financial and insurance statement, as they become available. The few important paper documents I receive, go into a "to-scan" pile, which I try to clear out every few months.

(I really need to set up a scanning appliance/station that takes only seconds to use. I only need a to-scan pile because it currently takes about 15 minutes to set up for scanning and put things away after.)


>set up a scanning appliance/station that takes only seconds to use

Fujitsu ScanSnap is a highly recommended product. I have one on my desk that I bought over ten years ago -- it's slightly larger than a box of facial tissues (kleenex), all I have to do to scan is open the two "flaps", insert document, press a button, then assign a file name and folder (easy once a history list is established). Scans color or B&W, single side or duplex.


I use temporal ordering for all paper financial papers I receive. I used to sort into folders by category, but that took too long. At tax time, I do a linear probe for the date range of the tax year.

Since Apple started providing better end to end encryption for Notes and iCloud, my wife and I just use that. We also have a ton of storage on OneDrive, so periodically I ZIP up important stuff, encrypt it and put on OneDrive.


Kind of a similar approach: one drawer for paperwork, google drive for all files, evernote for all notes with a dedicated notebook for different frequently used stuff like IDs, birth certificates, etc. Use search when I need something digital. Periodically throw away warranties and manuals from the big drawer (every 1-2 years) :)


I use the same system! Never thought of it in those terms though. TIL it's essentially an analog write-only data lake.


Does the analysis change for you if you consider the difficulty of wrangling the important stuff for someone else in your family, should something untimely happen to you?


A good book on dealing with the messy drawer and its contents is "Organise Your Paperwork: From Paper Mess To Paperless". Highly recommended.


A Synology NAS running Portainer (https://www.portainer.io/) running Paperless NGX (https://github.com/paperless-ngx/paperless-ngx)

This works better than I can possibly tell you.

I have an Epson WorkForce ES-580W that I bought when my mother passed away to bulk scan documents and it scans everything, double-sided if required, multi-page PDFs if required, at very high speed and uploads everything to OneDrive, at which point I drag and drop everything into Paperless.

I could, thinking about it, have the scanner email stuff to Paperless. Might investigate that today.

Paperless will OCR it and make it all searchable. This setup is amazing, I love living in the future.


I have Paperless watching a network folder from my NAS. The scanner scans directly into that folder and Paperless picks it up and processes it within a couple seconds. Even if Paperless went down for a bit (which happened before I upgraded my server), the files remain on the shared drive until it comes back online.

Without that, I had paper piling up waiting to be scanned.


IME Paperless is nice as an idea, but really buggy... Sometimes search works for OCRed text but not for labels. Sometimes the labels themselves are not shown, or need to be written/edited by hand by browsing its storage.

Besides that, it's a nice idea, but too focused on scanned stuff; today many docs are neither scanned nor PDFs.


That’s quite cool, I might add that. I did set up E-Mail watching using an iCloud account and it’s working well too. Must go donate some money to the Paperless team!


I haven't looked at Portainer before, what does it give you that just running a Docker instance on the NAS doesn't do?


Well, a lovely management system, basically. One can run and edit Docker Compose configurations in the browser, see everything that’s running, manage environment variables and see the logs. I mean, having written all that, I’m wondering why I don’t run it on my Mac instead of Docker Desktop.


Generically, Portainer can manage multiple servers at once, each running their own Docker.

For home server usage, I just use it as a frontend to docker-compose files that gives me a pretty UI I can access from a web browser, instead of needing to SSH in to the home server.


All types of documents go into a paperless ngx. All documents are tagged properly so that I can find them again. Also the search worked quite well so far. I’m still archiving the previous years currently, as I set up everything this year.

Backups via restic end up on backblaze. Haven’t gotten around to setting up the local backup. This is a good reminder to do so. :)

The photos end up in iCloud and Adobe with a backup in backblaze. Not so happy yet with this solution. But most alternatives I tried weren’t as comfortable.

Edit: The paperless ngx server is a low power machine that runs in my local network.


Same here, paperless changed my life. At my previous place I was also using one of the services where you get assigned an address with a fake unit number and they receive/scan/email all your post. The current country has much less paper mail, so I stopped, but would recommend that approach in general.

I'm also mirroring the family Google photos locally and backing it all up to rsync.net.

Regarding retention - everything gets kept, I don't care about retention rules. Things are tagged so a payslip from 2001 is still there if I ever need it, but I won't see it otherwise. It would cost more to think about it than the few hundred MB that it costs in storage space.


Interesting solution. Looking at the pricing for rsync: do they charge you for data transferred or data at rest? Or maybe I’m misunderstanding their pricing altogether. Also what are you using to actually sync?


You get access to a shell via ssh, https://www.rsync.net/products/platform.html

Looking at, https://duckduckgo.com/?q=site%3Arsync.net%2Fproducts&t=ipho...

You have rclone, sftp, rsync, restic, borg, synology and others.

My understanding is there’s no bandwidth pricing.

(I’m sure they’ll show up and clarify it. Amazing they always seem to do so)


Thanks - this is awesome. I will investigate. Might solve the problem I’ve had forever!


There is, in fact, no charge for usage/transfer/ingress/egress.

We would be very happy to have you.


Safe deposit box at the bank for the physical records. A labeled manila envelope stored locally for each of us with the typical "identity" documents, SSN card, birth certificate, etc.

Everything is stored locally, unencrypted, and backed up on a local removable HDD, and on BackBlaze. BackBlaze has saved my bacon more than once.

As for physical objects, they're all temporarily mine, until I pass on. They're either tools, consumables, or tied to personal memories. Everything else can go.

The one significant thing to pass on to others is all the data in D:\source\yyyy\yyyymmdd*, which is all of my photos, including family events, short videos, etc. As best I can, all the faces have been tagged with XMP metadata. (Nobody wants to keep an old box of photos without knowing how those people are related)

     Total Files Listed:
           392934 File(s) 725,479,895,968 bytes
           16103 Dir(s)  1,790,571,331,584 bytes free
  
  D:\Source>
I need to discuss with my spouse and child what they want from there.


What do you use for face tagging? It's a tedious task, I used to use Picasa but it wasn't XMP and it's deprecated now


I use DigiKam, it is far less crashy than it used to be, but it's not as good as Picasa was.


I scan all documents and paper mail and store the scans in a git annex repo with a few replicas on machines I own or rent. Originals go into binders, ordered by the hash of the document. With a few helper scripts and OCR with Tesseract searching through my documents is surprisingly fast, so is adding new documents at the proper place in the binders. The idea behind ordering by sha1 of the scan was that the binders would be storage only and all categorizing and organizing should be digital, and also that as I add more documents load balancing between the binders is trivial: at the moment I have one binder for hash prefixes 0-7 and one for 8-f, with a divider for each one-hex-digit prefix. If I ever feel there are too many docs per divider I can split further.
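The hash-prefix filing scheme described above can be sketched in a few lines of Python. This is only an illustration, not the commenter's actual helper scripts: the `BINDERS` mapping, the binder names, and the function names are hypothetical, and hashing the raw bytes of the scan file with SHA-1 is assumed from the description.

```python
import hashlib
from pathlib import Path
from typing import Tuple

# Hypothetical layout matching the comment: one binder for hash
# prefixes 0-7 and one for 8-f, with a divider per hex digit.
BINDERS = {"binder-1": "01234567", "binder-2": "89abcdef"}


def sha1_of_bytes(data: bytes) -> str:
    """Return the hex SHA-1 digest of a scan held in memory."""
    return hashlib.sha1(data).hexdigest()


def sha1_of_file(path: Path) -> str:
    """Return the hex SHA-1 digest of a scan on disk, read in chunks."""
    h = hashlib.sha1()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def locate(digest: str) -> Tuple[str, str]:
    """Map a hex digest to (binder, divider) using its first hex digit."""
    prefix = digest[:1].lower()
    for binder, prefixes in BINDERS.items():
        if prefix and prefix in prefixes:
            return binder, prefix
    raise ValueError(f"not a hex digest: {digest!r}")
```

Splitting a crowded binder then only means editing the mapping (for example, moving digits 4-7 to a new entry); the documents themselves keep their hash and their relative order.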

Emails go into a plain git repo, and photos go into another git annex repo.

Ideally I’d like to put all of that into Perkeep because I like the idea of just throwing anonymous blobs into a bucket with the possibility to organize and tag them after the fact, but it looks like that project is dead and I would need to add a few things to it to support my use cases.

I store a lot of scans and emails I won’t ever need again. I just consider that storage becomes cheaper faster than my storage needs increase, and that storing extra documents and saving myself the need to decide whether it’s worth keeping, is a good tradeoff.


The biggest “aha!” thing for me has not been any complicated data lake, but rather one simple text file.

In this file I write the date and a short note anytime something important happens. Visit the doctor and get test results? Write a note. Change the oil in my car? Write a note. Get a quote for a mortgage? Write a note. Open a new bank account? Write a note. Shop for new running shoes and learn what size fits best? Write a note. Have a contractor come work on the house? Write a note.

If there are important files that relate to those I list filenames and say where they are, but frankly I haven’t found that to be very useful. Rarely do I need any original documents. And if I died prematurely, my family can just scroll through this document to check off accounts (Google will automatically send it to them if I don’t access it for a period of time).

I’ve been using it for a few years and found it to be a total lifesaver. I frequently just need to quickly see how I did something, who I used, or how much something cost.

I store the live version on Google Drive so I can edit it anywhere, but you could host it locally.
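For anyone who wants to script the same habit, here is a minimal sketch of a dated, append-only text log in Python. The entry format (ISO date, two spaces, note) and the function names are assumptions for illustration; the comment above does not say how the file is formatted.

```python
from datetime import date
from pathlib import Path
from typing import List, Optional


def format_entry(note: str, when: date) -> str:
    """Format one log line: ISO date, two spaces, the note text."""
    return f"{when.isoformat()}  {note.strip()}\n"


def append_note(log_path: Path, note: str, when: Optional[date] = None) -> None:
    """Append a dated note to the log; existing lines are never rewritten."""
    entry = format_entry(note, when or date.today())
    with log_path.open("a", encoding="utf-8") as f:
        f.write(entry)


def find_notes(log_path: Path, term: str) -> List[str]:
    """Return every log line containing the term, ignoring case."""
    needle = term.lower()
    with log_path.open(encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f if needle in line.lower()]
```

A call such as `append_note(Path("life-log.txt"), "Changed the oil in my car")` adds one line, and `find_notes` covers the "how did I do that last time" lookup. A plain text editor works just as well; the point is only that the format stays trivially readable.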


I've been doing the data storage tango for thirty years now, and the one constant is that everything requires maintenance or has to be moved to a different platform and software eventually. Today, everything is in Google Drive and Google Photos; about 600 GB worth of data and media. Maintenance consists of an annual Takeout ordeal with copies going on two different removable drives, one of which is stored in a so-called fireproof bag.

The fireproof bag also has my account/password list (printed out once a year), birth certificates, titles, and passports.

If you are the tech person in your family, think about what happens if you die. Will your family be able to figure out and maintain your system? Mine is far from perfect, but most of my family can figure out Google Drive.


Where do you guys live to need to store birth certificate in a fireproof bag? I can just login into my government website with a digital identity provider and print one legally valid copy any time. Same for any identity related certificate.


Anywhere I've been in the US, getting a certified birth certificate (fancy paper with an embossed seal) will cost some money and take a week or more. The fireproof bag is just a convenient place to keep papers for which a scan is no substitute.


Do you store your login to the website in a fireproof bag?


United States probably.


Documents are in Subversion hosted on a local home Debian server which is also NAS.

Photos (hundreds of GBs) are also in Subversion hosted on same server. Photos are managed/viewed either via Digikam (on both Mac, Linux, Windows) as well as through a generated HTML/js gallery served on the same server.

Home server is backed up via restic to

* local harddrives that are rotated weekly and moved offsite.
* to a rented physical Debian server in a different country
* to a cloud storage provider

Server storage is fully encrypted using LUKS. Backups are encrypted via restic/ssh.

Various sensitive documents are separately encrypted inside the svn repos via 'age' or similar tools.

This setup has been working for almost two decades. I have yet to lose any data.

Edit: For documents, we have a personal repo for each family member. Typically, we use a folder structure based on https://johnnydecimal.com/

We also have a repo specifically for receipts, in which receipts are just stored as /<year>/<yyyy-mm-dd>-<name>.pdf (or .jpg if it is a scan/photo).
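As a rough sketch, the receipt naming convention above could be generated like this in Python. The function name and the slug rule (lowercase, non-alphanumerics collapsed to hyphens) are assumptions added for illustration; the comment only specifies the `/<year>/<yyyy-mm-dd>-<name>` shape and the `.pdf`/`.jpg` extensions.

```python
import re
from datetime import date
from pathlib import PurePosixPath


def receipt_path(when: date, name: str, ext: str = "pdf") -> PurePosixPath:
    """Build <year>/<yyyy-mm-dd>-<name>.<ext> relative to the repo root."""
    # Assumed slug rule: lowercase, runs of other characters become "-".
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    if not slug:
        raise ValueError("receipt name must contain a letter or digit")
    filename = f"{when.isoformat()}-{slug}.{ext.lstrip('.').lower()}"
    return PurePosixPath(str(when.year)) / filename
```

Because the ISO date leads the file name, a plain alphabetical listing of each year folder is also a chronological one.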

Legal documents like wills, etc. are stored in a safe as well as a physical copy stored at a relative + digital versions stored in the svn repo.


Honest question: why do you use a VCS to store the documents? The kind of documents OP is talking about don't get updates, same for the photos.


Great question. The reason is that my documents do get updated. This is not merely an archive as I also store various more live documents in it like budgets, etc. Some documents are indeed purely archived. But even for those, I find it really handy to have a complete log of when they were added. Who added them, etc. All this is stored in SVN. Also, SVN provides a really easy way to sync the folder to various devices (svn CLI tool on Linux and Mac, and TortoiseSVN on Windows clients).

I am aware that nowadays, SVN is not the modern choice compared to Git. When I designed this system, Git was not available. But I actually find that SVN's centralized architecture fits really well for this purpose.

For photos, I really like having them in SVN as I never lose any image even if I modify it. Sure, it takes up a little extra storage. But storage is cheap. Also, SVN here acts as an extra local backup.


At a previous employer, we stored 3 TB of data in SVN. The data was mostly archival, however sometimes corrections were made to historical data. Using partial checkouts it was quite easy to sync a subset locally. I don't use subversion anymore, but thinking about it I prefer the explicit and deliberate commit/checkout/merge over the automagic sync conflict generators that are onedrive, dropbox, nextcloud etc. Also all of these start to struggle somewhere around the 100k files / 1 TB data mark.


I've used SVN in a similar capacity for personal and business use, and even at some of my Customers. It's very handy and TortoiseSVN provides a good user experience. I got into doing this back in 2004 and even though Git might be a better tool today (maybe) I've stuck with SVN.


Dropbox for everything except photos, videos and security-related info.

I used to keep photos and videos on an external hard disk with a jwz-style backup mechanism and Lightroom 3 for organizing it (kept the Lightroom database file on the hard disk itself).

Now just gave up and use iCloud Photos. Not as good as Lightroom but can't find anything better that's also easy.

Secure details go into 1Password with a printout of emergency keys in a wardrobe under lock.

I used to use Evernote but moved away from it soon after they started introducing chat and a bunch of crap no one cared about.

I keep copies of often-referenced items from Dropbox in Apple Notes but the master copy is a file in Dropbox.

I'm using the Shortcuts app on the phone to take a receipt from an online payment and file it away in the /receipts folder in Dropbox.

I used to have an IFTTT integration to take all the photos I posted on Facebook and keep a copy of them in a Dropbox folder as well. It probably still works but I don't really post stuff anywhere any more.


This is a fascinating topic that doesn't get the attention it deserves.

The idea of trusting a single entity (the family computer expert, a single company, or a government agency) to be the sole host or custodian of important records and family photos/videos is fraught with risk: Even if that entity has redundancy across multiple geographic locations, they can have security breaches, their software can fail, their business can fail, or they can otherwise become incapacitated.

Security breach risk can be drastically reduced by encrypting your files before you even store them in that entity's system(s). But you'll need to protect that key and keep it in multiple locations that multiple people have access to. And it's going to make certain functions (like search indexing or file format conversions, to battle file format obsolescence) a lot more complicated.

Data loss risk due to failing hardware, software, companies, or people, can be greatly reduced by copying your stuff to multiple storage providers who ultimately rely on physically different data centers (e.g. if company A and B both ultimately use S3, Amazon is now my single point of failure), and taking an "append-only" approach.

But who is going to spend the time to pre-encrypt and sync all of your family's records like that? Possibly the family computer expert, who you can't depend on long-term.

If you're lucky, you have one who will do all of that for now, and maybe give everybody in the family a copy on a durable USB drive or similar* every few years. Or maybe you can pay someone to do the same. I'd pay for it, because it's a lot of work I don't want to do.

* For longer-term archival storage that should last several decades without power, M-Disc (DVDs or Blu-ray discs) used to be a go-to option. But that depends on people having access to a Blu-ray player decades from now, which I suspect will be a lot like the process of getting access to a floppy drive today.


> This is a fascinating topic that doesn't get the attention it deserves.

Agreed. Documents and Photos are the bane of my existence. To the point that I don't like taking photos on trips and events because I know what's to follow.


By points

1. Beancount via org-mode notes, org-babel for transactions, bean queries from the org-mode file for reporting. Not perfect, and a new thing; in the past I've used GnuCash but I prefer to put everything in org so...

2-5. equally in org-mode, org-roam managed notes, non-textual stuff as org-attachments linked directly in notes or indirectly, "elisp:(start-process-shell-command" alike, to run something other than merely opening the file.

I have to retain some things for fiscal reasons, some for personal reasons, and some are extras. It's not perfectly crafted, but accessing anything by title (org-roam-node-find) is like having everything in a graph, accessed via comfy search&narrow. When titles fail, consult-rg on the org-mode root (~/notes) does the rest.

I've tried org-ql to query notes for certain info like "active contracts I have", but even with templating (yasnippet) it's not that immediate; I have some queries and templates but use them only very rarely. I've tried Llama (Khoj) on my notes but, well, results are very bad compared to direct textual match access. So, yes, I have some clutter, but I do not see it much, and it does not consume a significant amount of storage nor pollute my searches so... It's manageable, and rarely I find something in the clutter, so it's not totally useless anyway. I am able to find anything noted so far; of course I do not log my life completely, so sometimes I'm looking for things I do not have. That's rare but it happens. In these cases the answer is "it depends".

I choose to live digitally in Emacs mostly because of integration: I have mails, todos, notes, anything in the same place/tool/model/paradigm like my mind is one for all.

Downsides: it's a personal thing, not really usable by other family members, not designed for collaborative work. It's possible to work together, but only as ISOLATED individuals, each on their own desktop, possibly Emacs but not necessarily and definitely not "the same Emacs", sharing steps/results by various means like patches via mail, direct textual mails "I found this and that, I conclude that" etc.


Really important documents are paper and stored in a filing cabinet in labeled folders. Every so often we go through these and discard the papers we no longer need (old tax records beyond 7 years, utility bills more than 6 months old, etc). If our house burns down or floods badly, we'll potentially lose these documents, and that's a risk we're willing to accept. Pretty much all of these documents are actually replaceable with effort, time, and a little money. I can get new notarized copies of birth certificates, old tax documents, will, passports, etc if I really need to but it would be quite annoying. The risk of losing them and the annoyance even each other out; it's a very small risk. If I die, no one needs to know how to use a computer in order to access my will, taxes, etc.

Important digital files are all on our family computer which is backed up to a cloud service which will mail me a disk to do a restore if needed. This includes all our photos and videos from since we've had digital cameras. If my wife and I both die, no one will have any clue how to deal with this and that's OK.

Photos we really like are printed and in photo albums organized by date. We'll lose these in a house fire and that's a risk we're willing to accept. The ability to easily pull a physical album and see pictures from a past time grouped together and curated is something we really like.


If you're doing all the data archiving yourself, and only need to make it readable to other family members...

I have a filesystem tree that organizes digital copies of all financial documents/data.

The tree is routinely backed up to rugged USB flash drives, labelmakered "FOR HEIRS", and stored in my safe and bank safe deposit box. The drives are unencrypted, and in FAT32 format (though I use Linux), to increase the likelihood that whoever needs it can access it easily.

At the top of the tree is a one-page `READ-THIS-FIRST.md` (a text file with CR-LF line endings, in case they view it on MS Windows). It's also printed as paper copies in the safes. This one-page document tells them about the "FOR HEIRS" USB flash drive, the safe deposit box, GnuCash, and various insurances.
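For anyone wanting to reproduce the CR-LF detail from Linux, here is a minimal Python sketch. The bullet text in `README_LINES` is a hypothetical placeholder (the actual wording of the one-pager isn't given), and the helper names are made up for illustration.

```python
from pathlib import Path
import tempfile

# Hypothetical placeholder text; the real one-pager's wording is not known.
README_LINES = [
    "# READ THIS FIRST",
    "",
    '- Where the "FOR HEIRS" USB flash drive is kept',
    "- Where the safe deposit box is",
    "- How to open the GnuCash file",
    "- Which insurance policies exist",
]


def write_readme_crlf(path, lines):
    """Write a text file with CR-LF line endings regardless of the host OS."""
    # newline="\r\n" makes Python translate every "\n" written into "\r\n".
    with open(path, "w", encoding="utf-8", newline="\r\n") as f:
        for line in lines:
            f.write(line + "\n")


def uses_only_crlf(path):
    """Return True if every line feed in the file is part of a CR-LF pair."""
    data = Path(path).read_bytes()
    return data.count(b"\n") == data.count(b"\r\n")


# Demo in a temporary directory (an assumption; point it at the real tree instead).
demo_path = Path(tempfile.mkdtemp()) / "READ-THIS-FIRST.md"
write_readme_crlf(demo_path, README_LINES)
print(uses_only_crlf(demo_path))
```

Older Windows text viewers were picky about bare LF line endings, so writing CR-LF explicitly is a cheap way to avoid the file showing up as one long line.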

Also in the safes is a one-page printout/PDF from GnuCash, which shows all my assets and liabilities, other than durable goods in the household. (I'm guessing those two one-pagers are most of what heirs would need, at least initially. If they need more later, there's the full GnuCash database, and the neatly named and organized PDF files of things like financial service statements.)

I've recently kludged an easy way to maintain household durable goods inventory, by kludging onto purchase transactions in GnuCash. I still have to find the time to encode all the cruft I've acquired in the past, though. If/once I do, a household inventory might become a third paper copy in the safes (for possible someday insurance or estate purposes).


A few folks have talked about digitizing paper documents, so one of my favorite tricks for iPhone havers:

iPhones have a very nice scanning tool built in, but it’s buried in the Notes app. Create a new note, click the camera icon, then “scan documents” and it will create a very nice, usually well cropped and OCR'd scan that’s saved as a PDF you can then export elsewhere. Wish it was a standalone app because it works so well, and this is from someone who helps digitize and preserve paper documents for a living.


There is an easier way now!

In the "Files" app, tap the (...) icon in the top right corner and select "Scan Documents".


Long press the notes app icon, and you’ll get a shortcut to scan documents


If you want a standalone app, Microsoft Lens is actually pretty good for creating pdfs from pictures.

I agree it's great to be able to do this natively on iphone without third-party apps.


Oh awesome, didn’t know!


thanks for sharing - it's a great feature.

one problem I've found with notes is that they don't appear to have any kind of export mechanism - i.e., you're tied to apple and that app. Is that right?


Tap the down arrow by the PDF, then click “share” and you can email it, move to Dropbox, whatever. They don’t make it obvious it’s just a regular PDF!


For me and now by extension the missus and kids: An organised chaos of diverse storage, for docs and photos. On a NAS, Google Docs, Dropbox, Flickr, various other local photo apps now offline (coppermine etc). All newer photos are on iCloud, though also backed up offline. A lot of older paper documents in physical folders, some organised but most just bundled together. Old physical photos are hopefully somewhere in the attic.

Most recent years all letters are however photographed on a phone and put into relevant shared photoalbums, then shredded. Albums by year and category e.g. "School docs", "Kids School work", "kids art", bills, statements, manuals and instructions, receipts, etc. Still have a big chore to go through all the old folders to scan and shred so we can really be paperless.

A few times I have had to dig out old bank statements or proof of address on a recent bill. However they all just wanted a photo of it, so storing it digitally was fine.


I'm a bit shocked that all answers so far use cloud storage and not a single one proposes offline.

I store everything in the cloud, because it is unbeatably convenient except for the very things mentioned: health, financial and legal. I'm not 100% strict either, because if a document has been sent or received electronically or can be assumed to be stored in the cloud anyways I really can't do much. I still assume that every online storage will be breached one day and keep some documents deliberately offline.

To give you some chills, check out "Who's been pwned" [1] and this is only the tip of the iceberg that has surfaced and that fulfills HIBP's inclusion criteria.

[1] https://haveibeenpwned.com/PwnedWebsites


> I'm a bit shocked that all answers so far use cloud storage and not a single one proposes offline.

Huh? I think I made the second or third comment in this thread outlining my solution which is offline-first:

https://news.ycombinator.com/item?id=37520972


besides the security aspect (which I agree is major and often unavoidable), local storage has numerous other advantages, among which: (1) search can be dramatically and noticeably faster, (2) access cannot be removed/degraded when traveling to countries in which various online services are not permitted.


Fairly simple here. Everything goes in iCloud with offline backups via Time Machine. Documents are scanned and disposed of where possible - collecting paper is a bad bad bad thing. Identity and required paper documents are kept in a locked drawer. Credentials are stored in KeePass. Photos and videos are all in Apple Photos. Instructions are left on how to deal with the offline backup should anyone need it.

Obligation wise it's good to keep all documents for whatever the longest record keeping period is in your region.

Important that you scan your identity documents and keep copies too. That has been incredibly useful when I'm abroad because a lot of people will retain passports etc if you go in a hospital or something so you can just show them the electronic copy and tell them you left it in the hotel.


Apple solves this problem well! Note that I use iCloud family which allows me to share the storage with my family. But for going down the list:

A) Mint for most financials. iCloud for storage of documents.

B) Apple Health for keeping track of my records. For ad-hoc stuff I use iCloud to store the documents.

C) Product and service info I use iCloud for. Logins and credentials to them are stored in my family's 1Password vault.

D) All legal documents stored on iCloud.

E) For photos and videos I use iCloud Shared Albums. It's unlimited free online storage of photos. Otherwise it's a mix of iCloud storage and the Notes app on Apple devices.

Obviously you can tell I'm deep in the iCloud ecosystem but personally I feel it's worth it. The synchronization and ease of use for my family is well worth it. For what it's worth I'm on the 200GB plan.


What do you do for recoverable backups? I also use iCloud for everything important but haven't figured out a good, automated system for full backups.


I guess the first question that comes to mind is what exactly are you trying to back up that can't work with iCloud?


thanks for sharing - for Mint, are you willing to type in your password to provide access to your other services/accounts? that has been a security bridge too far for me to date :) also, are there runners-up to Mint?


You give them read only access, the financial API ecosystem is extremely well defined and provides advanced access control. I'm pretty sure you just log in to your bank account and they establish read only access. I would be shocked if Mint stored these logins, I don't think they can do that.


do you use syncing? because if so, then i thought this wasn't a good backup strategy?

for example, my photos could get deleted on my iphone accidentally and that change would sync with all my other devices and that data would be lost forever.


I guess that is a risk I'm accepting. For what it's worth, for photos specifically, Apple keeps them in a deleted folder for up to 30 days. But I'm pretty careful with my digital content and I would much rather them be synced.


that's brave! i don't want any margin of error for my backups.

what happens if all your files get deleted from your computer and then they're deleted from your trash can in the same instant?


I separate my data into two categories, fast (and small) and slow (but big). Both types are important, but separation makes management much easier.

Fast data:

    - documents, pdfs, config files, notes txt etc.
    - these files are synced directly, on-site and off-site

Slow data:

    - archives, images, music, large backups etc.
    - these files are transferred to an off-site archive on a per-need basis
    - they are made available in the (family) cloud as read-only
    - they are made available to selected internal services read-only (music service, photo service etc.)

Tech stack:

    - I self-host
    - I follow best practices (zfs, 3-2-1 rule, vpn-only)
    - currently I use Nextcloud, rsync and borgmatic
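The fast/slow split above could be sketched roughly like this in Python. The extension list and the 50 MiB cutoff are assumptions picked for illustration, not the commenter's actual rules, and the function names are hypothetical.

```python
from pathlib import Path

# Assumed rules: both the extensions and the size cutoff are illustrative only.
SLOW_EXTENSIONS = {".jpg", ".png", ".mp3", ".flac", ".mp4", ".iso", ".tar", ".zip"}
SIZE_CUTOFF_BYTES = 50 * 1024 * 1024  # 50 MiB


def classify(name, size_bytes):
    """Return "slow" for media/archive files or anything large, else "fast"."""
    if Path(name).suffix.lower() in SLOW_EXTENSIONS:
        return "slow"
    if size_bytes >= SIZE_CUTOFF_BYTES:
        return "slow"
    return "fast"


def split_tree(root):
    """Walk a directory tree and group its files into fast and slow buckets."""
    buckets = {"fast": [], "slow": []}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            buckets[classify(path.name, path.stat().st_size)].append(path)
    return buckets


print(classify("tax-2022.pdf", 400_000))
print(classify("holiday.JPG", 3_000_000))
```

The two buckets could then feed different jobs, for example a frequent sync for "fast" and an occasional bulk transfer for "slow".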


if you're willing share: very roughly, how long did it take to setup and how long does it take to run incrementally?


I (deliberately) decided in 2017 to make myself (and my family) independent from public cloud providers and SaaS. The initial part was quite a bit of work, e.g. Proxmox hypervisor hardware rack setup etc, but I also learned a lot and I must say, every step felt like a tiny relief of independence.

For the last 2-3 years or so, it has been a more gradual progression. I only need about 1-2 hours monthly for administration work, since I automated many of these things:

- Nextcloud (like most of my services) is set up in Docker/compose and updated using Watchtower, except for major updates (e.g. 25.x.x to 26.x.x), where I want to be present

- I update my VMs (LXC) using Ansible; it is a single command that is started daily on my work laptop that updates all 12 VMs

- Backups are automated for the "fast" data category: A maintenance VM starts up my remote offsite backup through IPsec, unlocks the LUKS volume on a NAS and starts the Borgmatic job; after it is done, the LUKS volume is locked and the NAS is shut down (I plan to replace this with ZFS, too, which will further simplify things)

- The slow data (rsync) backup is still done manually: I log in every 1-2 months and copy & paste the command. I could automate this, too, but I want to be present here. Also, since ZFS and automated snapshots, I don't have as much need for a higher frequency of backups anymore.

- Once a year, all my data is backed up to external drives that I need to manually connect and then do a `zfs send`, to update snapshots.

- phones, laptops etc. of the whole family sync their content (photos, contacts, files) to Nextcloud; phone stuff is automatically removed locally from devices, once synced to the server; from there they're archived and backed up

If you ask how long the actual backup jobs require:

- I have only slow bandwidth to my offsite backup (5000 kbit/s)

- borgmatic is still very fast, about 2-3 hours weekly

- slow data (rsync) usually takes 1-2 days every month (about 80GB on average change here; for about 5 family members)
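As a sanity check on those numbers, here is a back-of-the-envelope calculation in Python. It assumes decimal gigabytes, a link that runs at the full 5000 kbit/s the whole time, and no compression or protocol overhead, so real transfers will differ somewhat.

```python
def transfer_hours(gigabytes, kbit_per_s):
    """Hours needed to move `gigabytes` (decimal GB) over a `kbit_per_s` link."""
    bits = gigabytes * 1e9 * 8
    seconds = bits / (kbit_per_s * 1e3)
    return seconds / 3600


hours = transfer_hours(80, 5000)
print(f"{hours:.1f} hours, about {hours / 24:.1f} days")
```

That works out to roughly 36 hours for 80 GB, which is consistent with the "1-2 days" figure quoted above.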


My approach is to look at contents as data-first with tools on the top. A few simple patterns of file systems manage this. Right now, I use Dropbox[1] as a primary container, with a family’s Google Workspace[2] account to augment when it needs spreadsheets, documents, forms, etc. I use Insync[3] to convert and back up a copy in open formats for all documents from Google Workspace.

I have been doing this for quite a while for our family, but I have extended slowly to immediate relatives while we helped them out with technology-related tasks. I haven’t reached the stage of complexities with the files and folder hierarchy that might break cloud providers[4].

Our family has a complete offline copy of everything, including backups of the photos, videos, and others. Everything is still in the works, and I recently started digitizing some old pictures from the in-laws. I stumbled upon some of their photos from around India’s independence and played with the colorization and animation thingy (that popped up a few months back). It became a craze, and their super large family gathered for a presentation, making many elders cry with tears of joy and disbelief.

My most significant achievements, compliments, and moments of pride have been that almost all of the new generations in the family choose to study computer engineering or computer-related. And so far, the best one is the niece commenting, “When I grow up, I want to marry a boy who is as technical as aunty’s husband.”

I have copies of the commercial ones (companies set up, wound down) since their inception. I also try to stick to plain-text, and as open formats as possible.

1. https://www.dropbox.com/

2. https://workspace.google.com

3. https://www.insynchq.com

4. https://news.ycombinator.com/item?id=37507148


thanks for sharing - what do you use for colorization?



Having built complicated storage systems during work hours I had a strong urge to build a home NAS. What stops me is that I have this Documents directory on my laptop, which was salvaged from another disk that failed about three years ago, and served for 11 years prior to that, and some documents were copied to it from a PC I owned before then... and going farther back doesn't really make sense as none of the documents would be really relevant / authorities at the time couldn't deal with digital copies of documents, so, storing them made no sense.

So, in the end, I just keep them in the Documents directory. Some, most commonly used are also in my email. That's it.


I've personally been asked by authorities to produce documents that are 30+ years old. It's a bit surreal and I wonder if they could even authenticate them...


I only have a handful of documents that are about that age: my birth certificate and high-school graduation diploma.

Here's an interesting aspect of these: I did lose my college papers a while ago, and when I asked my college if they can produce a copy, they told me that they keep their records for just seven years. So, they had nothing. The high-school I went to doesn't exist anymore. So, if I had to recover my high-school-related papers, I'd have none. But, if I really had to, I could recover my birth certificate, as those don't seem to have an expiration date.

Anyways. In terms of storing digital copies, very little of that is relevant. Even ten years ago most authorities wouldn't be able to accept digital copies. Even today, the municipality of the city I live in won't take digital copies. There's no process through which I could have a digital copy they would recognize.

So, I need to keep hard copies. And the digital copies? -- well, they are perishable. I keep them for convenience, because every now and then I run into some situation where they are accepted. But losing them won't be a big deal as I can always make more of those. So, I don't need any exceptionally reliable storage requirement for the digital documents. It's more like a cache, which can be eventually retired.


Any backup? Curious how you do it


I have to keep hard copies anyways. Plenty of places don't take digital copies and have no process through which they would. So, digital documents are only a convenience for the rare situations when someone will take them. And if I lose them -- well, I'll just take another picture.


Many years ago I read of an organizational practice that I refer to as a "trash buffer".

I have a fairly deep paper bin that I lay papers flat in. Any marginal paper documents or receipts that aren't obviously garbage and aren't obviously valuable go into this bin, always placed on the top.

Every 10-12 months I pick up the stack and remove, then shred, the bottom 1/4 of it - without review or remorse. If I didn't need them by then I never would.

I find that I actually use this buffer:

Every month or two there actually is some marginal paper document that I need to shuffle through and find. It's in reverse chronological order so that is very simple.
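For the programmers in the thread, the "trash buffer" is basically a stack that gets trimmed from the bottom. A toy Python model, with a hypothetical class and method names, might look like this:

```python
from collections import deque
from typing import List, Optional


class TrashBuffer:
    """Toy model of the paper bin: newest on top, oldest at the bottom."""

    def __init__(self) -> None:
        self._stack = deque()  # index 0 is the top of the pile

    def add(self, paper: str) -> None:
        """Lay a new paper flat on top of the pile."""
        self._stack.appendleft(paper)

    def find(self, keyword: str) -> Optional[str]:
        """Shuffle through from the top, so the newest match is found first."""
        for paper in self._stack:
            if keyword.lower() in paper.lower():
                return paper
        return None

    def purge_bottom_quarter(self) -> List[str]:
        """Remove the oldest quarter of the pile and return it for shredding."""
        count = len(self._stack) // 4
        return [self._stack.pop() for _ in range(count)]

    def __len__(self) -> int:
        return len(self._stack)


bin_ = TrashBuffer()
for i in range(8):
    bin_.add(f"receipt {i}")
shredded = bin_.purge_bottom_quarter()
print(shredded, len(bin_))
```

The point of the design is that eviction needs no decisions: age alone determines what goes, and lookups stay cheap because recent items sit on top.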


makes sense - but note: especially with receipts, there are use-cases where a tax authority might come back 7+ years later, or where heirs might need the proof after your passing (!) - and if they don't have it, it might cost them very large amounts of money in taxes.


I use https://www.trustworthy.com/ and Dropbox and am very happy with this combination.


+1 for Trustworthy, it’s a really fantastic product. The app is well designed, constantly improving and the founders really seem to understand the problem they are solving.


thanks for sharing Trustworthy - I'll check it out.

if you're willing to share: do you worry about their security or do you feel that they're at least as secure as what you could do yourself or other services might do?


A Fuji Snapscan S1500 + folder of PDFs on my laptop + full text search. Mylio for photos. I sometimes purge and file, but not very often.

There's always a bit too much that is attached to emails, I try to remember to save stuff out.

A Synology NAS in the attic which backs up the above.

No encryption.

A £50 "fireproof" box for passports, citizenship certs, things where the original is important.

I assembled a ridiculous old server to do tape backups but couldn't make a very practical system - I wish I could attach an LTO drive to something smaller!


>A Fuji Snapscan S1500

That's "ScanSnap", not Snapscan. :-) I posted about how handy these are before I saw your comment.

>There's always a bit too much that is attached to emails, I try to remember to save stuff out.

Same here. In fact, I always either save or delete attachments, I never leave them in my Thunderbird mail store.


We scan every document and put it in a folder synced with SyncThing. We throw the original away unless it is something official like a birth certificate.

We use SyncThing because it is free, doesn't require user accounts, and works across Mac, Windows, Linux, iOS (Möbius) and Android.

We have a bunch of high level folders (eg. Health, School, etc). Inside that folder every document is named starting with a date, eg: "2023-09-11 Invoice Dentist Kid1.pdf".

So far I've always found every document.
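A small Python sketch of that naming scheme, in case it helps anyone script their scanner output. The helper name `document_name` and the second and third example files are made up for illustration; only the dentist invoice name comes from the comment above.

```python
from datetime import date


def document_name(day, *keywords, ext="pdf"):
    """Build a name like "2023-09-11 Invoice Dentist Kid1.pdf"."""
    return f"{day.isoformat()} {' '.join(keywords)}.{ext}"


names = [
    document_name(date(2023, 9, 11), "Invoice", "Dentist", "Kid1"),
    document_name(date(2022, 12, 3), "Report", "School", "Kid2"),  # hypothetical
    document_name(date(2023, 1, 20), "Statement", "Bank"),  # hypothetical
]

# Plain string sorting is also chronological, because ISO dates are
# zero-padded and ordered year-month-day.
for name in sorted(names):
    print(name)
```

Because the date prefix sorts correctly as text, any file manager's "sort by name" gives a timeline for free.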


thanks for pointing to SyncThing - I'll check it out.

if you're willing to share:

1. what scanner do you use?

2. what do you do for documents that might be in two places? ex: car insurance, health insurance - in the insurance folder or the health folder? ex: medications that both spouses take, in his folder or her folder or in a "both" folder?


1) We have an HP all-in-one laser printer/scanner. I wouldn't recommend it, but it's good enough for documents. The one feature I like about it is that you can scan to a USB stick, which is nice because the scanner software on both Mac and Windows is often unreliable.

2) We have very few categories, so it's usually clear where something goes. If not, I just pick a random folder, usually I search with Spotlight (full text search) anyway.


I'm very grateful to everyone for sharing their thoughts and suggestions on a topic that is near-and-dear to my heart.

Does anyone have access to an AI into which they could copy-paste this whole page and have it extract a ranked list of the products that people mentioned, perhaps along with links and counts of people who mentioned them?

The page size exceeds the limit of what free gpt3.5 will do - so, just asking before trying to do it chunk by chunk.


Fortunately for me, most of the official documents here are digitized, and stored on government servers, so in the event of my death, my family can request access to relevant documents. That takes care of most of the official stuff. Also, everything legal is cosigned by my wife, so either one of us has equal access to it, but like official documents, these documents are also stored in government databases.

As for products, i have aimed to buy services that are easily transferable, i.e. we use iCloud family sharing, meaning every user has their own storage, but they piggyback on my paid subscription. If/when i stop paying, they can choose to continue paying by themselves, and things will just "magically" keep working, but otherwise it's a tricky game.

As for memories, i store everything in the cloud. Most is encrypted by Cryptomator. I have a small server at home synchronizing data home in real time, and the server backs up data hourly to a local target, and a couple of times per day to another cloud.

I have thought long and hard about decentralizing this to let each user's laptop do this, but with a family photo album of 3.5TB that would mean i lose deduplication.

It would also put much more demand on the client hardware, as it would need to be able to hold the entire photo library to back it up. I have LONG REQUESTED a way to backup iCloud data without the need to keep it all local, i.e. download upon use for backup, and delete again, but sadly that's not how it works. If your data is "cloud only", your backup will only backup file pointers, and not the actual photos.

Once every year i make Blu-Ray M-disc archives of the previous year's photos, and to some lesser extent also documents. I make identical copies and store them in geographically separate locations. I use no encryption or compression on these archive discs. I also have a set of identical external USB drives that i update with the entire archive yearly and then rotate between locations.

I am fully aware that optical media may not be around for much longer, but that's a problem for future me. For now it is the best and most affordable "long term, no maintenance" storage technology available to consumers. Until an official "end of life" is reported, i will keep using it. Once it's EOL i can easily migrate my archive to "something more better", even if that ends up being just ordering photo copies.


I too am in need of something that will let me archive photos without requiring a local copy. I don't understand why that's not easier to find. What do professionals do when they have terabytes of photo or video data, separate it out into disks and backup each individual disk as its own entity?


one approach might be to use multiple icloud accounts to shard the photos data by what you want to sync locally vs not (for example by set of years, i.e., myname1980to1990, myname1991to2000, etc). it's an added pain no doubt.


I live in a paperful country, and I have no scanner.

Important stuff has a photo taken, and it's likely backed up in my photo archives (Syncthing → Pi Storage → NAS backup).

My girlfriend still uses google and whatsapp, and I tend to send her important media in our chats, so we'd both have to be kicked out of our gmail and whatsapp accounts to truly lose everything there.


I did something like this for a long time.

One of the tools I found useful is Microsoft Lens mobile app: it's almost as quick as taking a picture of a doc, but it makes it easy to create a pdf of all the docs and bundle up multiple pages.

Over the years, I have lost access to important Google accounts. On one occasion, I had a very old Google Voice number which I used from time to time. At an unexpected time, Google sent me an email asking me to validate that I still needed it within 30 days. At the time, I was traveling and unable to check email for complicated reasons. I ended up losing the Google Voice number, which was the 2FA for a number of accounts, which then became irrevocably lost since the number was re-assigned.


Google photos for photos and videos, I use storage saver which reduces quality but whatever.

Everything else goes into Google drive. I use the Google drive app on Android to scan my physical documents, and I pay for the 200GB plan which is enough for all my stuff.

I've been hoping to write a script to backup stuff into an S3 bucket. I just never have the time.


> I've been hoping to write a script to backup stuff into an S3 bucket. I just never have the time.

Here it is [1]. Funny thing about backups is everyone needs something similar, yet everyone ends up rolling out their own tool set. Backblaze turns out much cheaper than S3.

[1] https://github.com/undebuggable/backup-upload


A folder on my desktop named “stuff”.

On a more serious note I use MS OneDrive family account and iCloud - I figure if either of those companies goes under there are bigger things to worry about. I print a lot of my photos so I’ve always got hard copies of the kids when they were growing up - everything else is unimportant or replaceable.


I've gotten as far as buying the photo printer but picking out what to print out of the flood of photos - and then where to keep the paper - has kept me from actually printing :)


I wrote about this recently, mostly through the lens of the sentimentally important things like videos and photos https://joshvince.site/blog/20230903_little_metal_boxes


thanks for sharing - physical objects are definitely big memory unlockers. Books are big ones for me.

Do you ever index your metal box by taking pictures of its contents so you have them accessible and not forget anything in there?


More and more, I'm finding myself using Notion to track "life data stuff". I realize this is a very imperfect choice, but it offers the best tooling I've found so far for this sort of thing. I may migrate to some self-hosted open-source Notion clone at some point (there are a few).


if you're willing to share further: what's your top OSS clone atm?


owncloud, offline. will back it up once a year or so as well. thought about sending it to Veracrypt and sticking it on a cloud somewhere just in case, but I don't think I have to.

Google Stack (no backups enabled) is great for paper > PDF, then I use a tool for Android called FolderSync Pro to send everything on my phone to a folder on my local network. Once a month or so I organize it. Thought about automating this a bit more but I hate having to spend time while going about my day naming/tagging files.

as far as retaining things, inside each folder I just make an "archive" folder where I stick things that get to be too old. definitely don't overthink the system or it'll turn into a full time job :-)


TrueNAS Scale (formerly FreeNAS) server with 6 x 4TB ZFS storage in a Raidz2 (or was it z3?) configuration.

I rsync my picture and document library from my MBP to the NAS. The MBP also has a Backblaze backup. 2-4 times a year I sync my pics, docs, and backups to a 6TB drive that’s offline.
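On the "z2 or z3" question, the difference shows up in usable space. A rough Python estimate follows; it only subtracts parity disks and ignores ZFS metadata, padding, and the TB vs TiB distinction, so treat the output as an upper bound. The function name is hypothetical.

```python
def raidz_usable_tb(disks, disk_tb, parity):
    """Rough usable capacity of one RAIDZ vdev: data disks times disk size."""
    if parity not in (1, 2, 3):
        raise ValueError("RAIDZ parity level must be 1, 2 or 3")
    if disks <= parity:
        raise ValueError("need more disks than parity devices")
    return (disks - parity) * disk_tb


for level in (2, 3):
    print(f"raidz{level}: about {raidz_usable_tb(6, 4, level)} TB usable")
```

So a 6 x 4TB pool would give at most about 16 TB with raidz2 and about 12 TB with raidz3; comparing that with what the pool reports is one way to tell which layout it is.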


I use paperless-ngx with 3-2-1 backup. It runs OCR on documents and indexes them to make everything searchable. I can go back years later, search for “toyota”, and find my car warranty agreement in an instant.


how active is paperless-ngx? it seems very much oriented in the direction I'd like to go. I once imagined setting up a camera rig over my desk so that any paper that I looked at got auto-captured and automatically archived without any explicit effort on my part :)


Paperless-ngx is the current active project. It is very active. It was forked off of paperless-ng when melendez became inactive. Which is also forked off of the original paperless when it became inactive.


I have a folder where I store everything. All documents are named starting with the ISO date, and I make sure the filename contains the relevant keywords. All paper documents are scanned.

That’s it.


Apple solves this problem for me entirely. No need to tag documents etc, just iCloud everything -> everything is searchable.


if you're willing to share: does this mean you trust apple or that other options are just worse?


I don't trust them in an emotional way, but I think that our interests are aligned.


OneDrive and a fire resistant box. I suppose I actually store far more in my emails than I should!


If you're willing to share: "more than you should" in the sense that it could fall in the hands of identity thieves? or some other reason?


I have the only copy of important correspondence in my web email. If that email is lost, or I simply cannot find it, I have lost the correspondence. I should at least have an offline copy.


Laptop documents folder and outlook inbox


If you're willing to share: curious how you structure your folders (all chronological? by entity producing the document? by your own use?). And also whether you periodically export your outlook inbox and, if so, what you do with the export.


Sort by date modified. A few things stashed in dedicated sub folders like tax returns. No exports.



