The problem I have always had when building elaborate home server setups is that the "set it and forget it" nature of the systems I've installed bites me in the ass. Since it's not my full-time job to manage these systems, I'm really not familiar with them the way I might be with the systems I manage at work. These systems cruise along for years, and when something finally does go belly-up, I can't remember how I set it up in the first place. Now I have a giant chore looming over me, ruining a perfectly good weekend.
These days, I design everything for home with extreme simplicity coupled with detailed documentation on how I set things up.
Docker has helped tremendously, since you can essentially use an out-of-the-box Linux distro with docker installed, and you don't really have to install anything else on the hardware. Then if at all possible, I use standard docker images provided by the software developer with no modifications (maybe some small tweaks in a docker-compose file to map to local resources).
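To illustrate, the "small tweaks" usually amount to nothing more than a compose file along these lines (the service, image tag and paths here are just examples, not a recommendation):

    # stock upstream image, with only ports and host paths mapped locally
    mkdir -p ~/srv/nextcloud/data
    cat > ~/srv/nextcloud/docker-compose.yml <<'EOF'
    services:
      nextcloud:
        image: nextcloud:28        # unmodified image from the developer, pinned
        ports:
          - "8080:80"              # map to a local port
        volumes:
          - ./data:/var/www/html   # keep all state on the host
        restart: unless-stopped
    EOF
    cd ~/srv/nextcloud && docker compose up -d

The compose file itself then doubles as documentation of how the service is wired up.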
Anyway, my advice is to keep the number of customizations to a bare minimum, minimize the number of moving parts in your home solutions, document everything you do (starting with installing the OS all the way through configuring your applications), capture as much of the configuration as you can in declarative formats (like docker compose files), back up all your data, and just as importantly, back up every single configuration file.
The author focuses the entire blog post on remote third party services that are alternatives to popular third party services financed by data collection as a "business model". IMO, the single most important component of a home network is not any piece of the hardware/software outside the home that the third parties may control, it is the internet gateway in the home. Routers were the most important computers at the dawn of the internet, and IMO they still are the most important computers today. If the internet gateway in the home is ignored as a point of control,^1 then IMO all bets are off.
A significant amount of data collection by third parties can be eliminated or reduced by retaining control over the internet gateway. Arguably this amount is even greater than what can be affected by simply switching to carefully selected alternative third parties. IMO, it is a mistake to believe that one can reliably eliminate/reduce data collection simply by choosing the "right" third parties. Whack-A-Mole, cat-and-mouse, whatever term we use, this is a game the user cannot win. Third parties providing "services" over the internet are outside the user's control. For worse, not better, they are subject to market forces that drive them to collect as much user data as they can get away with.
Regardless of these privacy-destructive market forces, it is still possible to build decent routers from BSD project source code and inexpensive hardware. IMO, this is time well spent.
Most of the data passing through it is encrypted: HTTPS, SSH.
Cutting off the phone-home requests is best done on the respective devices: you can run firewalls on most desktops and laptops, and even phones. Phones often go online via GSM or LTE, without passing through the home router.
While a proxy like pihole can be helpful sometimes, cutting off tracking and ads is best done by browser extensions and by using open-source clients, where available.
The best thing the home router can do is not be vulnerable to exploits, stay otherwise up to date, and be fully under the owner's control. That's why my home router runs OpenWrt.
"... cutting off tracking and ads is best done by browser extensions ..."
What if the browser vendor, who is also a data collector, requires the user to log in or otherwise identify herself before she can use extensions?
A home "gateway" is a computer running a kernel with IP forwarding enabled that is being used as the point of egress from the home network to the internet. That is a broad definition and allows for much creativity. That is what I mean by the term "gateway". As such, a gateway can, both in theory and in practice, do anything/nothing that "desktops and laptops, and even phones" can do. Relying solely on pre-configured "limited/special purpose" OS projects as a replacement for DIY and creativity in setting up a gateway was not what I had in mind, but is certainly an option amongst many others.
Don't use such a browser then! Firefox is pretty good.
An on-device firewall can firewall individual processes and applications. An upstream / gateway firewall does not have such fine-grained control. That was my point.
Running stuff on my router is entirely possible, but I limit it to routing and running a wireguard endpoint. I prefer to run my private stuff in the confines of the home LAN.
I use a text-only browser for reading HTML. Many times I don't even use a browser for making HTTP requests.
Truly one can use all these strategies, application-based, router-based, gateway-based, if they are available. They are not mutually exclusive. Personally I just would not feel like I can rely on extensions or other solutions tied to some software I do not compile myself. (I do edit and compile the text-only browser.)
All due respect to Firefox, but I have found compiling it is way too time and resource-intensive. It is way beyond what I need for recreational web use. Firefox users seem to rely on Mozilla to do the right things on their behalf. That is not the sort of "control by the user" I am after.
What is behind Mozilla? Online advertising money. Cannot really count on them to do what I want.
For me it has not. But it all depends on what sites one is interested in. Generally I do not tend to find much value in SPAs when I come across them. If there is some data accessed through the page that I really am interested in, I just find the endpoint and bypass the Javascript. Most sites posted to HN are not SPAs, do not require Javascript to read, and work perfectly for me. I can glean the information no problem, and fast. However, I am interested in reading text, not viewing graphics. Not every user has the same sensibilities.
I have always aimed to try them all and I will always try any new one I become aware of. I used Lynx from 1993 through the early 2000's, but since then I prefer Links from Charles University in Prague, with no graphics. It has the best rendering of HTML tables of any text-only browser, IMO. It has been the most stable for me at run-time and it is the source code I am most comfortable with, compiles quickly. There is a recent fork of Elinks 4.x someone has started that I have been watching. (Elinks is a fork of Links.) Not sure if it has been posted to HN yet. Currently it crashes too easily but some of the features he has added are good ideas, IMO.
Okay. How can we fix this? I'm dealing with it right now and this space is so hard -- likely somewhat deliberately so. I'm a 20+ year Linux user trying to get a single home network with multiple ISPs going and it just seems way harder than it ought to be; i.e. -- not that every bit of software needs to be idiot-proof, but this iptables/pfSense/netplan etc etc universe just feels downright hostile to the aspiring home user.
Multi-wan is easier with appliances. I used pfSense over the last 12 years or so with multi-wan on and off (currently off). I've run pfSense in a kvm VM, and you can do multi-wan with this. Though I generally recommend dedicated NICs for the WANs and LAN.
I've looked at the linux based appliances (as late as last week) and only clearos or openwrt supported multi-wan. I could be wrong (I'd like to be, as pfSense/OPNsense are FreeBSD based, and that comes with, sadly, huge amounts of baggage, limited hardware support, etc.). I'll likely be looking at that package as a potential replacement for the pfSense system, though if clearos can't handle what I need, OPNsense is like pfSense, but with far less baggage.
If you don't mind tinkering, you might be able to use mwan3[1].
If you prefer OpenWRT, you can look at running it in a VM[2] along with mwan3.
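As a rough sketch, a two-WAN failover/load-balance setup with mwan3 has roughly this shape (interface and member names are examples, and the option names are from memory, so verify against the current OpenWrt docs):

    # assumes 'wan' and 'wanb' interfaces already exist in /etc/config/network
    uci set mwan3.wan=interface
    uci set mwan3.wan.enabled='1'
    uci add_list mwan3.wan.track_ip='1.1.1.1'
    uci set mwan3.wanb=interface
    uci set mwan3.wanb.enabled='1'
    uci add_list mwan3.wanb.track_ip='9.9.9.9'

    uci set mwan3.wan_m1=member
    uci set mwan3.wan_m1.interface='wan'
    uci set mwan3.wan_m1.metric='1'
    uci set mwan3.wanb_m2=member
    uci set mwan3.wanb_m2.interface='wanb'
    uci set mwan3.wanb_m2.metric='2'        # higher metric = failover only

    uci set mwan3.balanced=policy
    uci add_list mwan3.balanced.use_member='wan_m1'
    uci add_list mwan3.balanced.use_member='wanb_m2'

    uci set mwan3.default_rule=rule
    uci set mwan3.default_rule.dest_ip='0.0.0.0/0'
    uci set mwan3.default_rule.use_policy='balanced'

    uci commit mwan3 && /etc/init.d/mwan3 restart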
I set up multi-WAN in a few small business contexts (on a Zyxel appliance, and on pfSense) and to be honest it was always a little flakier than I hoped. It was always failing over too quickly, or too slowly, or not reverting back to the primary WAN quickly enough when the failure/congestion cleared up. I think any sort of multi-WAN setup where you aren't actually an AS running BGP is doomed to be somewhat hacky. Of course running an AS is both a ton of work, and basically impossible for a residential customer anyway.
It is uniquely suited to running services without interruption: file shares, messengers, torrents, an HTTP server, etc. Smartphones and laptops are mobile devices with spotty connections, powered by batteries.
Gateway, not router. I personally do very few and simple things on the gateway. Way less than what an average default-configured router running OpenWRT does.
No worries. Usually they are the same. The terms are probably interchangeable. The point I was trying to make is that the gateway does not necessarily have to be a pre-packaged router with a difficult-to-manage, complex configuration. It can be the computer over which the user feels she has the most control.
No, I'm in the process of moving to OPNsense. Unlike pfsense it's actually open source, not controlled by a company that tries to drive competition out with dirty tricks and they aren't trying to move their customers to a DRM solution.
How hard has it been to move to OPNsense? I've been dragging my feet on doing it because my pfSense router is set up just where it needs to be, but I really do want to move.
Yep. This will cruise along longer than the parent's solution, but when it breaks, you'll be rebuilding all of the original services from scratch, plus the management system you had built to manage them.
But it only breaks when all systems fail together; if your router fails, you can rebuild it from the gitlab job. If the VM host fails, you have time to replace it because the rest of your network still functions. If your git host fails, same thing, but where did you put those backups?
My solution is Kubernetes. Everything's configured in YAML files. The solution to all those problems is... change fields in YAML files.
Of course, you need to figure out what you need to change and why, but you'll never not need to do this, if you're rolling your own infra. K8s allows you to roll a lot more of the contextual stuff into the system.
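As a toy example of what that workflow looks like (the image and names below are arbitrary), the whole lifecycle really is edit-a-field-and-reapply:

    kubectl apply -f - <<'EOF'
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: homepage
    spec:
      replicas: 1
      selector:
        matchLabels: { app: homepage }
      template:
        metadata:
          labels: { app: homepage }
        spec:
          containers:
            - name: web
              image: nginx:1.25   # "upgrading" = changing this field and re-applying
              ports:
                - containerPort: 80
    EOF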
Do you find there to be a good amount of overhead in running your own Kubernetes cluster? I'd think initial setup would be a bit of work, and then keeping the cluster updated and patched would be a good amount of work as well.
Then you've just traded maintaining one system for maintaining another.
Just started this journey myself and while there’s tons to learn, getting something up and running using k3os and flux cd takes no time at all and gets you a working cluster synced to your repo. K3s is pretty light, I know some people running it on pis.
If you use hosted Kubernetes (GKE, EKS, etc) then you don't need to deal with any of that, which is nice. You get the flexibility of Kubernetes without needing to care about any of the underlying infra
Once you learn it it's pretty straightforward. K8s has a very simple underlying architecture. It's intimidating at first, but yields to study and care.
I have also been using Kubernetes for this for years now. My favorite property is that it will run forever, no matter what happens.
The annoying part is that when I do want to do updates (e.g. updating cert-manager from 0.1.x to 1.0.x, etc.) it can be a pain. So I save these large updates for once a year or so.
The solution is keeping a local mirror of all images and artifacts, and version pinning for stability (along with a periodic revision of version numbers to the latest stable version).
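Something along these lines, for example (the registry address and image are placeholders; the point is that the copy you deploy from is one you control):

    docker run -d -p 5000:5000 --name registry registry:2    # local registry

    docker pull nginx:1.25                                   # fetch once from upstream
    docker tag  nginx:1.25 localhost:5000/nginx:1.25
    docker push localhost:5000/nginx:1.25                    # keep your own copy

    # look up the immutable digest to pin in compose files
    docker inspect --format='{{index .RepoDigests 0}}' localhost:5000/nginx:1.25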
Oh, and don't forget that even if you make everything work now, in two years' time your setup won't be reproducible, because chances are the original images are no longer available; they got deleted from Docker Hub some months after you used them. Yeah, you should update them anyway for security... but the setup itself is not reproducible, and being forced to use the latest version of something, with the new idiosyncrasies it might bring, is not a nice situation to be in when you just want to hurry up and resolve your downtime.
So that's one more thing to worry about, it seems: maintaining your own image repository!
Maybe, but when the original docker image is no longer available on docker hub, chances are there will be something better and even easier to set up. And with docker you don't care about installing / uninstalling apps and figuring out where that obscure setting was hidden - all you need is just a stock distro and a bunch of docker-compose.yml files, plus some mounted directories with the actual data.
But a lot of those unofficial docker images are of unknown quality and could easily contain trojans. It's completely different from installing a package from your distro.
On the plus side, the Dockerfile and the repo with the scripts used to build a container is usually available. If you don't trust it, read through the source and rebuild it. Or just stick to official containers, no matter how terrible they are.
Even if so you're still spending say 50% of the original time investment every year or so just maintaining it. Unfortunately your options seem to be "set up once then never touch it again" or "update everything regularly and be at the mercy of everything changing and breaking at random times".
I mean, you should always have a backup of your dependencies (within reason).
I develop mobile applications, and use Sonatype's Nexus repository manager as my primary dependency resolver. Every time I fetch a new dependency, it gets cached.
A monthly script then takes care of clearing out any cached dependencies which are not listed in any tagged version of my applications.
Agree that documentation is key here. Anything you do that is beyond the vanilla "pave the install and plug it in" should be written down.
It doesn't need to be perfect - I have a OneNote notebook that has the customizations that I've done to my router (static IP leases and edits to /etc/config/network), and some helper docs for a local Zabbix install in docker that I have. I recently had to figure out how to migrate a database from one docker image to another, and there is no way I would remember how to do that next time, so I wrote down everything I learned.
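(For the curious: the note is basically a handful of commands of roughly this shape, with prose around them explaining why. The container names and the pg_dumpall/psql pair below are placeholders, not my exact procedure.)

    docker exec old_db pg_dumpall -U postgres > dump.sql     # dump from the old container
    docker compose up -d new_db                              # start the replacement
    docker exec -i new_db psql -U postgres < dump.sql        # restore into it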
Just a simple copy/paste and some explanatory text is usually good enough. Anything more complex (e.g., mirroring config files in github) still (IMO) needs enough bootstrap documentation because unless you're working with it daily you're going to forget how your stuff works.
Additionally a part of my brain is worried that if I get hit by a bus my wife/kids will have a hell of a time figuring out what I did to the network. Onenote won't help them there but I haven't figured out the best way of dealing with this.
(I recognize the irony in a "I'll host it myself" post in storing stuff in onedrive with onenote but oh well)
Just to throw more products at the wall, I've been using Bookstack[0] for the same sort of documentation.
Besides being relatively lightweight and simple to set up, the out-of-the-box draw.io integration is nice. Makes diagramming networks and other things dead simple. And I know "dead simple" means I'm infinitely more likely to actually do it.
I also started doing something similar via org-files, git, emacs, and Working Copy. It has worked pretty well, though Working Copy (the iOS git client) was buggier than I expected (but they have a great developer and support). My network isn't very good, or I'd just use emacs on iOS via SSH via Blink.
I work on trying to script each install. So if I need to repave, I have a documented, working script, and the source bits to work with.
I've preferred VMs for functional appliances for a while now. I like the isolation compared to containers. Though YMMV.
Right now, the hardest migration I have is my mail system, which makes use of a fairly powerful pipeline of filters in various postfix-connected services. It's not fragile, but it is hard to debug.
I host it myself; as the core thesis of the article pointed out, you can be deplatformed, for any reason, with no recourse. And if you lose your mail, you are probably in a world of hurt.
The one thing I am concerned about is long term backup. I need a cold storage capability of a few 10s of TB, that won't blow up my costs too badly. Likely the best route will be a pair of servers at different DCs, running minio or similar behind a VPN that I can rsync to every now and then. Or same servers with zfs and zfs send/recv.
Thinking about this, but still not sure what to do.
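If I go the zfs route, the mechanics would be roughly this (pool, dataset and host names are placeholders; the remote pool is assumed to already exist):

    zfs snapshot tank/data@backup-1
    zfs send tank/data@backup-1 | ssh backup-host zfs recv -u backup/data    # initial full copy

    # later runs only ship the delta between snapshots
    zfs snapshot tank/data@backup-2
    zfs send -i tank/data@backup-1 tank/data@backup-2 | ssh backup-host zfs recv -u backup/data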
I've got the same problem: I have an Ubuntu fileserver I set up a few years ago, but none of the monitoring/alerting that I'd have if I set it up for my day job. And really, I don't want to do my day job at home.
So it's a bit of a catch-22, I want a secure and stable home system, I don't want to spend much time working on it, but I want full flexibility to install and run what I want, and don't want to trust some off the shelf consumer solution that's likely going to be out of support in a couple years.
I just use a docker-compose stack (one yaml file next to a bunch of subdirectories) templated out in Ansible.
Ansible will be around for a while, but even if it's not, its (YAML) syntax is incredibly easy to read. Any successor in that area is somewhat likely to have compatibility or at least a migration path.
This together reaps the benefits of Docker (enhanced through Compose), and Ansible is documentation in itself. There's barely any actual comments. The code speaks for itself. Also, I can reproduce my stack with incredible ease.
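For a flavour of what I mean (the names, paths and variables below are invented for the example, and the inventory is left out by pointing at localhost), the whole thing is roughly:

    mkdir -p templates
    cat > templates/docker-compose.yml.j2 <<'EOF'
    services:
      app:
        image: "{{ app_image }}"
        ports:
          - "{{ app_port }}:80"
        volumes:
          - "{{ data_dir }}:/data"
        restart: unless-stopped
    EOF

    cat > site.yml <<'EOF'
    - hosts: localhost          # point this at the real server via your inventory
      connection: local
      become: true
      vars:
        app_image: nginx:1.25
        app_port: 8080
        data_dir: /srv/app/data
      tasks:
        - name: Ensure the app directory exists
          file:
            path: /srv/app
            state: directory
        - name: Render the compose file from the template
          template:
            src: docker-compose.yml.j2
            dest: /srv/app/docker-compose.yml
        - name: Bring the stack up
          command: docker compose up -d
          args:
            chdir: /srv/app
    EOF

    ansible-playbook site.yml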
This right here. I recently lost my home server of 10 years courtesy of the Texas power issues during the winter storm. I rebuilt and started with a fresh Linux install. Having a recent backup of /etc made it so much easier than it could have been. I had more trouble with the network driver on the new motherboard than with all my services, customizations and data.
Yikes, did you have it on UPS? How did a power issue kill your server? A big spike? Most filesystems like XFS should be able to fsck even after a power drop, especially with RAID. (I prefer RAID-10 for speed of rebuilds.)
Surge protector, but the spikes were huge. My drives were actually fine; what got killed was the PSU, and it was a case-integrated PSU in a system I had been meaning to upgrade for a while anyway. So this gave me the kick in the butt to finally do it. I also went to NVMe for the root drive; wow, what a difference.
The system that died was not out of date software-wise; it had been dutifully upgraded with every major Debian release. So starting on a fresh install, adding packages, copying conf files from my backups, and adding my data drives really wasn't bad. A couple of days after work to do the hardware build and restore.
Because of shortages / cryptomining / covid I couldn't get the HW I really wanted, that was frustrating.
Eventually, whatever platform / tool you use will need to be upgraded. Security vulnerabilities, new features, etc happen and projects like these can get abandoned within a 5-year timeframe. When you have to migrate to either a new or upgraded platform, you have to figure it all out yourself. When the config is broken by an upstream dependency, you’re on the hook too. Who knows if the build tools you used still work on current versions of things.
Like it or not, we’re all kind of stuck on these platforms we don’t control. The alternative is to become fluent in yet another technical stack, but one that will be used infrequently and won’t really translate to anything else unless you’re trying to build your own cloud service on consumer-grade hardware.
Today I heard fellow SREs discussing whether RRDTool is the best solution for monitoring private things. Its only merit: it stopped evolving. That might or might not outweigh the decades of progress.
RRDTool is great for exactly this reason! I actually still use it to plot time series data on a raspberry pi for various projects. It hasn’t changed in at least a decade and it will run with good performance on any hardware. If it ain’t broke, don’t fix it
Long term platform stability seems to be moving from relying on the OS to higher up in the stack with the advent of Docker. It feels like Docker is being used more and more in cases where I would have considered something like CentOS.
Docker feels like the equivalent of the teenager that doesn't want to clean his room so he just pushes all his mess under the bed.
The complexity is still under the bed. We're all going to have to dig under that bed one day. Or we're just going to end up buying new hockey sticks, football pads, etc. Which is to say, we're going to end up with Linux on top of Docker on top of Linux.
I'm not sure I follow the analogy. I'm guessing the mess represents a messy filesystem/environment. What does the bed represent? A Dockerfile? Dockerfiles are way more transparent than, say, a random server's / directory.
That's just my opinion of course. It's possible to write confusing Dockerfiles but really they're mostly just shell scripts. And the idea of "Linux on top of Docker" seems a bit odd - there's only ever one Linux kernel no matter how many containers you have running. Docker is built on Linux.
It's closer to owning a car factory making disposable cars and getting a new one when it gets dirty. At some point your supplier shuts down and you cannot make a specific part anymore, which prevents you from building another car. You are now stuck with the dirty car.
If your application doesn't rely on hardware details (such as GPU acceleration or networking) that bypasses well-defined OS APIs, it's actually a very good approach.
But when you do Kubernetes or Docker, there’s almost always another stack beneath you that you have to build and maintain. Whether that’s bare metal (yikes), VMWare ($$$) or AWS/Azure (not self-hosting) you still have to deal with upgrades / API changes / hardware refresh there.
Container solutions really only get you out of “doing the work” if you can leverage a prepackaged container management solution from a cloud provider. Self-hosted containers are frequently more trouble than they’re worth since the solutions for managing them are either insanely complex (Kubernetes) or so simplistic you have to write some custom logic to build/deploy them.
Agree with the point about CentOS though; at this point the idea of a Linux "distro" is dead. The way forward is a hypervisor model where the kernel is protected from application code (including all dependencies from the userland), with barebones Docker images like Alpine used as the basis for a declaratively-defined system. The one thing Docker does very well is isolate dependencies, which reduces integration complexity and lets you modularise the whole thing. Incidentally, this actually makes it harder for hobbyists to get into, as the mental model has a lot more complexity and you need more expert knowledge to write the scripts to build and configure your app automatically.
Seems like there's a probably reasonable trend of piling some other tool on top of the stack because dealing with underlying layers is hard. Like electron apps and docker images. Or just web browsers.
Kind of worrisome to abandon lower layers with their problems and build on top of them, but what can you do but get good at Jenga?
I feel the same, however I force myself to keep doing that. Without docker.
Yes, every time a new debian release is out something I fiddled with will break and I have to remember how it works, but I see it as some sort of training. As a dev I don't want to be completely clueless about how things I use every day work, so while changes that break old configs and workflows are annoying I'm forced to learn what changed in the gnu/linux/debian world and maybe even find out why.
Also I get better at documenting things. Years ago I didn't even know what exactly to document since the moment you do something everything just naturally comes together, but after a couple times you kinda get a feeling for what the important bits will be two years down the line.
So about once a year I reserve a perfectly good weekend to upgrade, restructure or otherwise maintain my little home server running debian with things like mariadb, nginx, a filtering bridge for my lan, dnsmasq with a block list, borgbackup, syncthing, cups for printing and a couple other things I don't remember right now.
I encounter this every 6-12 months when I go back to an old project that is 'working' and want to add something/update something and it all just looks foreign to me.
The worst thing is that I have often gone through a lot of effort around making it easy to set up and deploy (docker and whatnot) but even that I have forgotten about. (I came across a docker file in an old project and couldn't get it to work properly until I noticed that there was a docker compose file lying around that I had missed)
How do you keep track of documentation? I guess for a project a README in the git's root is a good start, but what about more complex systems stuff that does not live in a git project? For example, I had to manually edit a bunch of config files on my Proxmox setup to get docker and some other things to work properly. Where would I document such manual steps? I am thinking a text file somewhere in cloud storage but then of course I'd need to remember that...
My doc is split between Notion (for bigger, structured projects) and a bunch of local md files (for general, greppable knowledge).
For my VPS, I have a Notion page where each project (name, url, mapped ports) is a row in a table. Then the project page contains a copy of my docker config and various information I might need for maintenance/reinstallation.
I don’t try to put the documentation next to the thing being described because then I’ll lose track of it. I set up a simple Gollum wiki and put everything there. That way I get md for useful formatting, version control on the docs without having to create a zillion git repos, and I never have to wonder where the docs are.
Nix helps with this. Or at least it intends to. I still need to track down what's vulnerable and what isn't, but most of my setup is reproducible thanks to Nix.
I've read people admitting that Nix can give you a lot of work from time to time, when something isn't already available for Nix (which happens more often than, say, for Debian), and you want to add it to your system. Is that true in your experience? I may be phrasing it wrong.
I found the nix learning curve to be steep but short. The first time you need to make something available for nix (particularly a service), you'll probably copy something someone else wrote, make a few changes and then (if it didn't work) stare at your screen for a while wondering why it didn't work[1].
It is very different which can be very off-putting, but is not usually gratuitously different, and once you get used to it, it's pretty straightforward.
After having done it a few times, I find that I can adapt a random project not already in Nixpkgs for nix in under an hour, and it's something I do maybe twice a year or so.
One counterintuitive advantage I found switching to nix from other systems is that since the 4 step "download, configure, make, make install" usually doesn't work, I take the time to make a nix expression. On Gentoo and Arch, I would often just install to /usr/local from source and then forget what I had installed and not know how to upgrade it. If you have more discipline than I do, then it's a bug not a feature, but for me it's super helpful.
1: If the project uses cmake or autotools and has no strange dependencies, then packaging it for nix is trivial. However a surprising number of packages do things like downloading dependencies from the internet at build time, and it's not always immediately obvious how to adapt that to nix. Projects using npm or pip also probably won't work right away just because the long-tail of dependencies means that there will be at least one dependency that isn't already in nixpkgs (haskell should in theory be just as bad, but the strange proclivity for haskellers to use nix means that someone has probably already done the work for you).
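To give a sense of "trivial" for the cmake/autotools case, the whole expression can be about this small (pname, repo and hash are placeholders; the fake hash gets replaced with the real one nix prints on the first failed build):

    cat > default.nix <<'EOF'
    { pkgs ? import <nixpkgs> {} }:

    pkgs.stdenv.mkDerivation {
      pname = "some-tool";
      version = "1.2.3";
      src = pkgs.fetchFromGitHub {
        owner = "example";
        repo = "some-tool";
        rev = "v1.2.3";
        sha256 = pkgs.lib.fakeSha256;   # nix will report the real hash to paste here
      };
      nativeBuildInputs = [ pkgs.cmake ];
    }
    EOF
    nix-build default.nix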
Just firewall off all management interfaces and allow access via IP as needed. It's still possible your webserver will become vulnerable, but you'll probably hear about it here if it is.
I almost guarantee that when you need to make a change to your home server in 10 years, docker compose will no longer be able to pull any images from docker hub without being upgraded, and upon upgrading you'll find all your config is now invalid and no longer supported...
> when something finally does go belly-up, I can't remember how I set it up in the first place
Why should you need to remember it? Like you wrote later on, you just "document everything you do", as you do it. That's better than any sort of script or version control, since you can describe it how you like, which means quite succinctly. And it's not too difficult to adopt the mindframe of "What would I have needed to know 10 minutes ago to understand what to do".
> Docker has helped tremendously, since you can essentially use an out-of-the-box Linux distro with docker installed, and you don't really have to install anything else on the hardware
This actually illustrates, IMHO, how containers and docker are overused. You're talking about a single machine with a single purpose/set-of-purposes, and no occasional switching of configurations. So - why containerize? Whatever you have on Docker, just have that on your actual system.
> my advice is to keep the number of customizations to a bare minimum
Sound advice based on my own (limited) experience with self-hosted home-servers.
> and just as importantly, back up every single configuration file.
Fine, but don't rely on this too much. It's always a pain, if at all possible, to restore stuff based on the config files. Usually easier to follow your self-instructions for configuration.
I think it's important to document anything, be it at home or for business. I've done good and not so good jobs with this at times, and you feel like a hero when it's well done and you suffer through reinventing things if it's poorly done.
I agree you should have config files and backups, but like with work, I think it would be good to go through a "disaster" where you have to build a version of your config from an unconfigured environment.
A colleague had built such a config for a small office (10 people), but it depended on a specific DNS server that ... wasn't set up until halfway through the config. It worked because in testing they were using a network with a DNS server that was already working. Small things like that are hard to catch in a working environment.
What I sometimes do is make a bash script that does all the setup and save it in the home folder. You can just copy/paste each line from the script to set things up again, and you'll know exactly what you did to the system later on.
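Something in this spirit, for example (the packages and paths are whatever your own setup needs):

    #!/bin/sh
    # setup.sh -- every step doubles as documentation of what was done
    set -eux                              # stop on errors, echo each command

    apt-get update
    apt-get install -y docker.io dnsmasq rsync

    # the fiddly bits you'd otherwise forget:
    cp ./configs/dnsmasq.conf /etc/dnsmasq.conf
    systemctl enable --now dnsmasq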
> These days, I design everything for home with extreme simplicity coupled with detailed documentation on how I set things up.
Definitely a good strategy, yeah. And on the documentation front, that's exactly what I've started doing differently, too; after having to recreate my mail server one too many times, I eventually decided "you know what, I should probably write down what I'm doing", and now if things go belly-up again I at least have that starting point.
As a bonus, you can also put it online as a tutorial, which is exactly what I did: https://mail.yellowapple.us :)
I have been self-hosting since forever, or rather, since ADSL became available. The server has changed "a bit" during the last 20 years (oh, Mercury mail server :D), but for the last "few" years (since FreeBSD 8 :D) I have been a happy camper there. Just using it casually and learning a thing or two along the way. Upgrading it, migrating it (twice so far), adding disks. Nothing special.
I just got into a situation where the motherboard started failing, so I said it's time to reinstall, bought new hardware and threw it together. In a week I migrated everything that had been customized over ~20 years without taking notes (but with my good ol' trusty diffing software) from the previous server, optimized a bit, removed unnecessary settings, upgraded postfix and dovecot on the new server, replaced spamassassin with rspamd, php-fpm, ...
The server was mostly operational in 2 days. Everything else was studying new software, doing things I had always wanted to do but didn't want to upend the whole configuration for (like stuffing everything into jails), customizing netdata, upgrading the database/nextcloud/... etc. It would have taken longer if I had lost the data, but I trust zraid and my LTO drive; they have never failed me.
Now I will sniff around it every week, maybe run some updates, etc., and I will be fine until some disk fails.
Being a system administrator is not my occupation, but it helps to NEVER EVER become a hostage to a cloud provider. The more you lean into the leisure of someone else doing everything for you, the fewer chances you will have to learn new things and the more you will be at the mercy of someone else. And the longer you enjoy such a situation, the more the technology progresses and the bigger the gap grows between the knowledge you have and the knowledge needed to, in this case, set up a server.
I remember dreaming in 1991, as a kid, that in 20 years everyone would have their own server at home, and how we would transfer files simply. But now everyone says they don't have time and buys ready-made boxes or pays for the cloud.
The technology came, now it is simpler to use than ever.
But no one has "time" for that now. Or is it really about the time?
---
Don't take having a home server as a pain. It is a great way to learn things.
I can bake perfect bread too. "Kicked" all my girlfriends out of the kitchen. Brewing home beer. ...
But none of this would have happened if I had instead gone to the bakery. Eaten in restaurants. Bought beer in a store. ... Sure, I could have, but I didn't.
You don’t even need Docker. Another great technology to learn is LXC and driving it from the CLI / a script.
All my hosts run a default OS with a 20 line firewall and a bridge.
The top level host has a zpool backed by a blank file in /tank.device.
The actual work is done by a bunch of LXC hosts all cloned from a standard base installation. Anything persistent goes in a per-container zfs filesystem mounted in each container's root. The only thing I ever back up is the /tank.device file.
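For anyone curious, the cloning part is only a few commands; something like this (container and dataset names are examples, and the exact lxc.* config keys depend on your LXC version):

    lxc-create -n base -t download -- -d debian -r bullseye -a amd64   # standard base install
    lxc-copy -n base -N webserver                                      # clone it

    zfs create tank/webserver                                          # per-container persistent data
    echo "lxc.mount.entry = /tank/webserver srv/data none bind,create=dir 0 0" \
      >> /var/lib/lxc/webserver/config

    lxc-start -n webserver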
Wiping everything and building from scratch is a pleasure.
I personally found Docker and Compose pretty simple to learn, and after years working with them I'm pretty well acquainted, but I'm interested to learn whether there are any tangible benefits to using LXC instead of Docker.
Docker has nice image caching and fetching. It’s all about deploying and containerising in one step, usually a one line shell script alongside the container config.
LXC is more like deploying disposable OS instances and then having to deploy your app yourself, separately. It’s more bare metal but has fewer surprises in terms of supporting IPv6 and not having any inscrutable iptables magic happening on the host OS.
Docker is IKEA. LXC is hiring a joiner. (Not a value judgement.)
If you want stability I would go with a custom OS, e.g. no auto updates, only update parts of the system if you need/want to. And apply security/hardening in several layers. Make backups of course. Then you could leave it running, with really nothing to worry about except hardware failure. The problem is that when the hardware does fail, it might be difficult to get a CPU with the same architecture (like 50 years from now).
I guess this goes back to whether you are a DIY person or not.
For me, I treat that just like I do any service in my home. I am the type that will tear apart my dryer to fix it vs. buying new or bringing in a repair person.
I repair my own car, appliances, etc. For me the Home server is the same.
I have been through this way too many times too. Every method I found was either as bad or added more problems (like docker, configs in git, etc.). Now I have removed almost all customisation/configuration from services I need and exchanged them with what are basically default setups of containers in proxmox. Everything is backed up, and the amount of configuration needed if something did somehow go away is measured in minutes instead of hours. Luckily I don't need remote access, which makes this easy.
The diagram alone is more than enough of an argument to dissuade me from giving this a shot right now - it's simply too complicated and too much to manage for the amount of time I can dedicate to it.
BUT - I'm really thankful for people who keep posting and sharing these sorts of projects; they're the ones iterating the process for the rest of us who need something a bit more turn-key.
I'm excited to see this eventually result in something like the following:
- Standard / Easy to update containerized setup.
- Out of the box multi-location syncs (e.g. home, VPS, etc.)
- Takes 5 minutes to configure/add new locations
I want this to be as easy as adding a new AP to my mesh wifi system at home: plug it in, open the app, name the AP, and click "Done".
I think: do a little at a time and keep at it. Over time it adds up.
At some point you will hit something interesting: Personal Sovereignty.
I've seen other folks hit this in weird ways.
My friend started working on cars with his buddy. They finally got to an old vehicle that they took all the way apart and put back together. He had gotten to the point where he could pull the engine and put it on a stand, weld things, paint, and redo the wiring harness.
I remember one day I went and looked at it and he sort of casually said, "I can do anything".
Anyway, I think the diagram says something else to me. It says he understands what his setup does enough to show it/explain it to someone else.
I had this with my bicycle at some point -- learning to fix and tweak it myself without having to go to a mechanic was eye-opening. Reminds me of the core premises in Zen and the Art of Motorcycle Maintenance.
I think the diagram gives a skewed view of how hard this actually is.
I run a very similar setup only my VPS is only a proxy for my home server and it requires very little maintenance. I run everything with docker-compose and I haven't had to work on my setup at all this year and only about 8 hours in 2020 to setup the Wireguard network to replace the ssh tunnels I was using previously for VPS -> server communications.
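For reference, the WireGuard piece is tiny; the home-server side is roughly this (keys, addresses and the endpoint below are placeholders):

    cat > /etc/wireguard/wg0.conf <<'EOF'
    [Interface]
    # home server side of the tunnel
    Address = 10.0.0.2/24
    PrivateKey = <home-server-private-key>

    # the VPS peer
    [Peer]
    PublicKey = <vps-public-key>
    Endpoint = vps.example.com:51820
    AllowedIPs = 10.0.0.1/32
    # keeps the tunnel open from behind home NAT
    PersistentKeepalive = 25
    EOF
    wg-quick up wg0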
At the end of the day YMMV and use what you are comfortable with, but it's not as crazy an undertaking as it sounds.
Somewhat OT, but I never realized how expensive those cloud instances are. For comparison, I pay $4.95/month (billed annually) for a KVM VPS with 2 GHz, 2 GB RAM, a 40 GB SSD and a 400 GB HDD in the Netherlands. That seems a lot better for selfhosting, where you probably want more raw storage than SSD space.
If that is enough to handle everything you need, then that is definitely a better deal. The electric bill for a similar home server running 24x7 would be more than $6/mo.
I went down an almost identical path/plan, but then stopped due to corruption concerns with doing the VPS / home sync the way that I wanted without a NAS in the middle managing the thing. It’s still possible, but it explodes the complexity.
One of the big things I wanted to accomplish was low cost and easy to integrate / recover from for family in case of bus-factor.
I didn’t expect to compete with the major cloud providers on cost, but the architecture I was dreaming of just wasn’t quite feasible even though it’s tantalizingly close...basically, all the benefits of a p2p internal network with all the convenience of NextCloud and all the export-ability of “just copy all these files to a new disk or cloud provider.”
It’s so close, there’s just always some bottleneck: home upload is too slow, cold cloud storage too hard to integrate with / cache, architecture requires too much maintenance, or similar.
I think NextCloud is very close for personal use, if only there was a plug and play p2p backend datastore / cache backed by plug and play immutable cold storage that could pick up new entries from the p2p layer.
There is a cryptocurrency called Siacoin. It offers cloud storage, and there exists a Nextcloud plugin to integrate it as a storage backend. I have some plans to try this setup.
What do you think?
You are absolutely right: if you are not familiar with docker-compose, ssh tunnels, wireguard, etc... it will take more time to set up. That being said, as far as maintenance goes, you will probably have a similar experience.
Most of my setup was done through SSH during boring classes in college so I had plenty of time to read documentation and figure out new tools.
After reading through it all, I think this is more a condemnation of the author's diagram (or at least their decision to put that particular one up-front) than of their process in general, or of the challenge itself.
Breakdown of (my) issues with the diagram:
- author's interaction with each device is explicitly included, adding unnecessary noise
- "partial" and "full" real-time sync are shown as separate processes, whereas there's no obvious need to differentiate them in such a high-level overview
- devices with "partial" and "full" sync (see above) are colour-coded differently; again differentiation unnecessary
- including onsite & off-site backups in the same diagram is cool but would probably be nicer living in a dedicated backup diagram for better focus
Oh wow, I didn't realise Asciiflow had started supporting extended ASCII - I've been using Asciiflow (via asciiflow.com) for years, but haven't used it for a few months, and missed this being introduced!
As much as I love the way HN's design goes against many trending "UX" conventions, I think the long-time refusal to put in very very basic simple fixes like this one is bizarre.
The messed up presentation on mobile is 100% a mobile bug, for which there is a very easy fix on the dev side, and no good workaround on the commenter side.
i've actually daydreamed about starting a computing appliance company that would make a variety of services plug and play for consumers and small businesses, from email to storage, to networking, to security, and to smart home. it's actually the direction apple is headed, but they're encumbered by the innovator's dilemma, which leaves an opportunity for an upstart. google and facebook are similarly too focused on adtech, while amazon on commerce, to lock up this market yet.
I've wanted to make something like this too. After years of iteration, my self-hosted setup is now completely automated, and the automation itself is super simple and organized. It would be pretty simple to build a web app that lets users apply the same automation steps to their own VPSs. The hardest part would be setting up a secure process for managing user secrets, to be honest.
Business wise, I'm not sure I'd be willing to pay for just the automation... in reality you don't use it very often. Could be interesting to try (re)selling tightly knit VPSs, more advanced automation features or support.
I think this solution still captures the self hosted ideology while also providing some cool value. I see people reinventing the wheel all the time while trying to automate self hosted processes... but then again maybe that's why we do it, we like the adventure!
It would be fun stuff to build, but I feel like you'd struggle to make money. Google and Amazon can afford to give away the hardware, and they can smuggle their ecosystem into your house as a thermostat or a smart speaker or a phone app, or whatever.
Like, how do you persuade the audience of enthusiasts (think: Unifi buyers) to pay for a subscription to managed software they run on their own computers, raspis, whatever? I would probably spend $10/mo on something like that, but much above that and you'd be fighting against the armchair commentary of users who won't appreciate the effort that goes into stability and will basically have a "no wireless, less space than a Nomad, lame" attitude.
Hardware sales. People will pay for the convenience of a device that works out of the box with minimal setup.
On the software side, integrate tightly with your own subscription services (offsite backups, VPS, etc) to upsell to those who want that, and win over the enthusiast crowd by making it possible to host your own alternatives to those services with a little technical know-how.
Open source most components to appeal to enthusiasts, but keep the secret sauce that makes everything seamless and easy to use "source available" so you don't unintentionally turn your core business into a commodity.
Alternatively, it's what Windows Home Server might have become had MSFT kept at it. OTOH, the fact that Microsoft abandoned it might be an indicator of how well such a thing might sell.
You clearly have the upsell part, but where is the "and win over the enthusiast crowd by making it possible to host your own alternatives to those services with a little technical know-how." part?
I and probably many others would be OK with paying for the upsell part, if it's an optional convenience, but nothing I saw on your site indicates it is, or that "own your data" is in any meaningful way true. How do I own my data if any use of it requires me running stuff on your proprietary box, subscribing to your proprietary service?
If you store data on a hard drive you purchased from Best Buy, do you own that data? It's a proprietary box also...
Data from Helm is accessible using IMAP, SMTP, CardDAV, CalDAV, WebDAV on the local network (without requiring our service). You own the device, you own the data. There is a standards-based way of accessing that data just as there is with the hard drive from Best Buy.
> If you store data on a hard drive you purchased from Best Buy, do you own that data?
I can plug the Best Buy harddrive into almost any computer/SAN I want to and utilize for the purpose I bought for without any lock in. Using the hard drive for my data requires very little trust in Best Buy's good intentions at the time of purchase and zero trust in the continued existence, technical competence or good intentions beyond that -- it is very unlikely that best buy will find a way to snoop on my data even if they wanted to. Everything will continue to work fully as intended until mechanical failure sets in. Feels like ownership to me.
In your case I rent some box from you which will lose almost all its intended function the moment your company goes bust or I stop paying you an annual fee. Furthermore it seems I am completely at the mercy of your original and continued good intentions as in addition to using your lock-in for a big price hike later you presumably can also snoop on my data. As far as I can tell there is no really substantial trust differentiator to protonmail. I have to trust them when they claim they won't read my email and are competent enough to keep things secure and I have to trust you (and continue to trust you as long as I want to use the device) that you will encrypt my data and not exfiltrate any of it (or the private key), and furthermore that you run your servers securely enough that no third party will. But what is to stop it? The box is running closed source software that you can remotely update anytime you feel like it, right? I have physical access, but since I don't control the software, what use is that?
Maybe I didn't understand something right, but so far this does not feel like ownership to me.
It sounds more like the worst of all worlds: the lock-in and lack of ownership of a proprietary cloud-based subscription service with the added hassle, inconvenience, downtime and costs of babysitting (and supplying electricity to) a cloud server for you, the provider.
> Data from Helm is accessible using IMAP, SMTP, CardDAV, CalDAV, WebDAV on the local network (without requiring our service).
In what ways is that better than running an IMAP client on my laptop and using it to send data via protonmail, using my own domain, and keeping offline copies of everything (with a periodic upload to backblaze or some other e2e encrypted backup solution)? That seems to offer about the same control over my data, but is cheaper, easier, more convenient, has higher redundancy/uptime and if anything less lock-in. It also doesn't require an additional device that has no use beyond adding an additional failure point and cost center that's my responsibility.
> There is a standards-based way of accessing that data just as there is with the hard drive from Best Buy.
Accessing the data is not enough because what you are selling me is not an overpriced and unergonomic hard drive. You are selling me the ability to send, receive and store email (and likely more).
Don't get me wrong, I kind of like the idea of buying a physical box and a subscription service to self host stuff in a way that gives me better control over my data for an acceptable amount of hassle. But that really requires some amount of openness/auditability and interoperability that currently appears to be absent.
> In your case I rent some box from you which will lose almost all its intended function the moment your company goes bust or I stop paying you an annual fee.
No - we do not rent hardware. When people buy the server from us, they own it. Full stop. There are ongoing costs to make email at home work: a static IP address with good reputation, a security gateway, traffic, etc. If people don't want to pay us for those costs, they will pay them to an ISP and/or an infrastructure provider like AWS. The ease of setup and management comes from the integration of hardware, software and service.
> I am completely at the mercy of your original and continued good intentions as in addition to using your lock-in for a big price hike later you presumably can also snoop on my data.
This is true of any paid service you use right? They can increase your costs at any time. I'm not sure why you think there's something uniquely bad about us for this reason. We have pretty clear values around wanting to know as little about our customers as possible and designing our products end to end around that. We have worked pretty hard at reducing costs, bringing the server price down 60% while doubling its specifications. Our goal is to make this as cost effective and accessible as possible for everyone. We are not interested in locking in customers - it's easy for anyone to take their data off Helm and go to a server of their own making or another service of their choosing. That's not hypothetical - like any company, we have churned customers and supported them in their migration off our product. It's easy to sling these hypotheticals you are concocting but they are not borne out of any reality.
> As far as I can tell there is no really substantial trust differentiator to protonmail.
There is actually a substantial difference. Protonmail holds your data on their servers and therefore can turn it over without a warrant. Well it's encrypted, right? So what could any entity do with that data? Well, Protonmail may be compelled to modify their service to intercept the password on login to decrypt your inbox and turn it over to a government authority (if you don't think that can happen, see what the German government did to Tutanota).
We aren't in a position to do that. Even if the US government came with a court order for your encrypted backups from us, we don't have access to the keys to decrypt them. If we were asked to make firmware changes, we would be retracing the steps of the FBI/Apple San Bernardino case and would enlist the help of the EFF, ACLU and others to fight. I personally believe the case law is pretty clear that they wouldn't win, which is partly why the FBI relented earlier.
> that you can remotely update anytime you feel like it
You make this sound like a terrible thing but really it's not. It allows us to keep our products patched and secured over time.
> In what ways is that better than running an IMAP client on my laptop and using it to send data via protonmail, using my own domain, and keeping offline copies of everything (with a periodic upload to backblaze or some other e2e encrypted backup solution)?
I didn't say people couldn't roll their own solutions. Sure they can - it's just more work, hassle and fragile. And I already covered the tradeoffs of keeping that data in the cloud. Protonmail has access to all your email in the clear (inbound and outbound). We do not and anyone running a server at home would have similar privacy. That's a clear difference.
> Accessing the data is not enough because what you are selling me is not an overpriced and unergonomic hard drive. You are selling me the ability to send, receive and store email (and likely more).
Actually it is because we were talking about data ownership. Your specific dig was about how "own your data" was in any way true ("or that "own your data" is in any meaningful way true" in your parent post).
If you want to self-host email, you need a trustworthy static IP address with reverse DNS. It's considerably more expensive to get this from an ISP. Our annual fee also includes storage for offsite backups. You don't get the same privacy assurances using Protonmail as you do with self-hosting either. For example, Protonmail is privy to the content of all outbound email messages in the clear unless you are communicating with the recipient using E2EE.
From a cost perspective, Helm V2 starts at $199 for 256GB of storage. First year costs, including subscription work out to $298. With Protonmail, their entry level plan with added storage at the same price buys you an inbox with about 28GB, a small fraction of what you would get with Helm storage-wise, not to mention we don't limit users, email addresses, domains, etc.
there are actually tons of companies in this space already making money (e.g., wyze), but it’s highly fragmented and none have a unified vision or product strategy yet. so yes, they’re vulnerable to the behemoths right now, but those dynamics aren’t locked in yet.
it’s mostly tough because of the high upfront capital costs (manufacturing, r&d, and marketing). people still talk fondly about discontinued apple routers and what nest could have been as an independent venture, for example.
Maybe I'm misunderstanding the pitch from the GP, but Wyze seems like it's pretty clearly a hardware + cloud services play, similar to most other IOT ecosystems except maybe Hue. The (optional) monthly cost paid there is for loosened restrictions on an already existing works-anywhere setup— it's an upsell for power users, not a cost of entry.
This seems a lot easier to me than on-prem cloud services, either in BYOH form ("but it's just software") or as a packaged appliance ("another hub to install, really?").
I would say that the closest thing to this right now for paid is coming from the storage side— NAS providers like Synology using hardware sales to support a limited ecosystem of "one click" deployable apps. And for free, it's ecosystems like HomeAssistant, which a lot of people just deploy as a fire-and-forget RPi image, but as expected with a free ecosystem, as soon as you get off the ultra-common use cases, you're reading source code to figure out how it works, and wading through a tangle of unmaintained "community" plugins that only do half of what you want.
the primary value-add is one layer higher than a NAS, a standalone router, or homeassistant but would likely be built on those kinds of things. it's providing a range of hardware devices that can work seamlessly together in a way that you don't have to muck around with config files or programming and yet have it all be secure and private by default. the value is in an ecosystem of safe appliances that require little technical knowledge.
home audio/theater from prior to the internet revolution might be a good analogy: a bunch of separate boxes that each provide tailored functionality but all work together seamlessly without a lot of technical knowledge. that, but for all sorts of computing devices.
> there are actually tons of companies in this space already making money (e.g., wyze), but it’s highly fragmented and none have a unified vision or product strategy yet. so yes, they’re vulnerable to the behemoths right now, but those dynamics aren’t locked in yet.
I also think it's still a little too nerd-focused for the average consumer. I'd say I know far more about security, networking and hardware than the average consumer but, compared to the HN crowd, I know next to nothing. I struggle to use a lot of the current solutions because they get bogged down in doing cool technical stuff that is so far outside the scope of the average potential user's wants/needs or the DIY solution will be "easy"... for someone with an extensive CS background and years of experience.
i think you need to go up a level of abstraction from this perspective to see the utility for the average consumer. we each have computers all around us, phones, tablets, tv's, and increasingly everything else. it's so hopeless to manage, much less understand, these mysterious machines for more and more people. what you want is a company you can trust to manage these things for you but gives you the ultimate, yet cognitively bounded, control over them.
for instance, plug in a smart device and have confidence that it's not doing surreptitious things behind your back, because it's automatically segregated into its own vlan and given only enough network access to be controlled by you without needing to know much about the underlying technologies involved.
It's not "hopeless to manage", learn some networking and be forever rewarded. Same with learning to manage devices, servers, etc. I develop now but I'd be much less valiable without that background.
"a company you can trust to manage these things for you"
"automatically segregated into its own vlan"
Aren't these goals fundamentally at odds? I would imagine that Joe consumer (if they care at all about any of this) would be rather more inclined to entrust the role of orchestrating/segregating their home network devices to an entity like Google than to some random startup.
the average person doesn't even know what these things are, which is partially why there is a market opportunity here. what they know is that companies like google and facebook are not entirely trustworthy but they have no alternative. it's hopeless, until an entity comes along and gives them some hope in the form of an alternative. basically all of the things we talk about around preserving privacy and security on the internet need to be built into our devices, and companies like google actively oppose such limitations of their reach into our lives.
You're getting into Ken Thompson's "Trusting Trust" territory here.
When you lose trust you end up with your crazy uncle leaving Fox News for Alex Jones and YouTube. You have people becoming QAnon followers.
I say this not to make a political point, but that the problem is fundamentally hopeless and I see no way out. You end up landing on one side of the fence or the other. You either just don't think about it and continue to use Google and Facebook and remain ignorant of the problem, or you spiral down the never-ending hole of despair.
We have seen articles recently that tell us not even Signal can be fully trusted. Whether or not it's true is beside the point. The point is, not even the HN crowd is safe from the cliff of paranoia. The seed of doubt has been planted.
Is someone going to trust a small tech startup in 2021? No, not like they would have in 1997. The market for trust has effectively been sealed off today. Because, paradoxically, the Googles and Facebooks ruined it all. They stripped us (all of us, not just HN) of our innocence and naivety. We know not to trust Google, but they are also a known known. A small tech company is a total unknown. We're familiar with how Google is going to bend us over. So if someone is going to do us dirty, it may as well be a known entity. Or... you go and build a cabin in the woods and start writing manifestos.
There are very mundane reasons that you may want to own the software stack your data depends on that aren't 'I don't trust Google/Facebook/Microsoft/The Reptilians.'
The most prominent of which is "What happens when they drop support for my use case/lock me out of my account?"
Unfortunately, the cost of running your own one-off solution is rather high. And doubly unfortunately, while I would pay money for a box that I could plug into my computer that provided all of these services, I wouldn't pay enough money to justify someone building it, and selling it to me.
while i agree google and facebook have certainly peed in the pool, that strikes me as overly cynical, if only because only a fringe few actually radicalize or become helplessly paranoid in that way in practice.
most people, whether they rationalize it or not, are cognizant that we live in the grey gradient of trust for various companies and brands. the vector field is all sorts of wacky and inscrutable, but maybe we can point a few of those vectors in the right direction and some folks will happily slide down it to better (but perhaps not perfect) safety and privacy.
Having spent the past year frustratingly trying to build these types of things in AWS and spending too much money with mistakes I'd say there is a huge opportunity here. SMB or NFS as a service for example.
https://www.rsync.net/ has been selling this solution for years. Price competitive these days. Not affiliated, just looked at it recently and thought it was extremely cool.
in the late 90's there was Whistle, which partnered with ISPs and delivered pretty much this - router, email, web host, storage space, calendar, firewall - and it was easy to configure with an ISP.
From what I hear, it’s all pretty nicely containerized/turnkey already. There are even several “meta apps” (Eg: Homelab OS, YUNoHost, etc) which are like the base layer on which many of these services are available as “applications” which have been pre-configured and can be trivially instantiated.
> The diagram alone is more than enough of an argument to dissuade me from giving this a shot right now - it's simply too complicated and too much to manage for the amount of time I can dedicate to it.
Yeah. I have a basic home server and I feel like even with fairly modest needs/desires (Jellyfin, Deluge, Zoneminder, some kind of file syncing, I gave up on photos because my whole family uses Google for that), it's hard to find a reasonable workflow/setup that covers it all. It was basically down to partitioning by VM (proxmox) or partitioned by container (docker), and I went with Docker + Portainer, but I'm not really happy with it; even basic functionality like redeploying a Compose configuration has sat as a feature-ask for three years [1].
Maybe I'm wanting it to be something that it just isn't, and I'd be happier with microk8s and managing the apps as Helm charts. But is that just inviting additional complexity where none is needed?
I used to have a portainer centric setup.. now I just use docker-compose directly. I have my compose split into different files with a makefile to keep things "make start" simple. Highly recommend.
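For illustration, a minimal sketch of that kind of Makefile wrapper (the compose file names are invented, and recipe lines must start with a tab):

    # Hypothetical split: one compose file per group of services
    FILES := -f docker-compose.base.yml -f docker-compose.media.yml -f docker-compose.cloud.yml

    .PHONY: start stop pull logs

    start:
    	docker-compose $(FILES) up -d

    stop:
    	docker-compose $(FILES) down

    pull:
    	docker-compose $(FILES) pull

    logs:
    	docker-compose $(FILES) logs -f --tail=100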
These setups are complicated because they try to save costs by using some part of the cloud. Shared VM resources in the cloud, which are all you really need, are dirt cheap compared to the really simple alternative.
Renting space in a rack at a colo facility and putting an nginx server on it is really simple, but it's also expensive compared to the complex solution in the original post.
Like has been pointed out, this can (and probably should) be done over time.
I have migrated from using cloud provided storage to Nextcloud (been running that for over 4 years now without issues), and have my calendar and contacts in there as well.
My ongoing task is to fully migrate all my images, videos and calibre library from Dropbox to other self hosted entities.
Depends on what you want - for file sync, there are definitely similar things out there. As for a generic homelab, I'm writing a similar article atm, and I can tell you, it will maybe be simpler to follow, but definitely not less complex. Everything depends on your needs. What's important here is that ideally you only need to do this once, and then only do light maintenance.
just use k3s + (restic + velero [backup]). it's so much easier: you can basically install everything with the same tooling and update everything with the same tooling. if something breaks, bam, you can just restore the whole cluster with velero (including local volumes)
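For anyone curious what that looks like in practice, a rough sketch (the bucket name, plugin version, and credentials file are placeholders; --use-restic was the flag for backing up local volumes in Velero 1.x):

    # Point velero at an S3-compatible bucket
    velero install \
      --provider aws \
      --plugins velero/velero-plugin-for-aws:v1.2.0 \
      --bucket homelab-backups \
      --secret-file ./credentials-velero \
      --use-restic

    # Back up the whole cluster, including restic-copied local volumes
    velero backup create full-$(date +%F) --default-volumes-to-restic

    # Disaster recovery: rebuild the cluster, re-run `velero install`, then
    velero restore create --from-backup full-2021-04-18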
The good thing (and one of the reasons I like Linux and dotfiles) is that you can start right away and keep sophisticating your setup as you go. You don't lose that configuration which is akin to knowledge.
I bought a Qnap NAS a month ago. I thought I would get it set up right away for my Linux machines, Macbooks, and network. I was wrong. But I'm slowly learning every couple of days, and now I have a systemd unit that mounts two NFS volumes on my Linux machine.
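Not my actual unit, but a minimal sketch of what such an NFS mount unit can look like (server address and paths are made up; the file name has to match the mount point, e.g. /etc/systemd/system/mnt-media.mount):

    [Unit]
    Description=QNAP media share over NFS
    Wants=network-online.target
    After=network-online.target

    [Mount]
    What=192.168.1.50:/share/media
    Where=/mnt/media
    Type=nfs
    Options=vers=4,rw,noatime

    [Install]
    WantedBy=multi-user.target

Enable it with `systemctl enable --now mnt-media.mount`, and add a matching .automount unit if you'd rather have it mounted lazily.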
We always have this debate at work. Do we build the system ourselves or do we purchase a product? On prem Prometheus or push everything to DataDog? I'm always a fan of building things myself because I like building things, but my company compares engineering time vs product cost.
I want what you want, yet how can society reward the work involved in creating a turnkey version of such, other than through the standard capitalist self-interested paradigm?
Ok, I’m SUPER into self hosting, but this article? No way.
1) Duck out isn’t a thing, just stop it.
2) Half the articles cited as examples of corporate abuse were later revealed to be mistakes by the user or easily avoidable pitfalls.
3) Self hosting still requires trust (software you’re running, DNS, domains, ISP, etc...) The line of who to trust and how far is a tough one to answer, even for the informed.
How I solved it:
1) I use well vetted cloud services for things that are difficult/impossible to self host or have a low impact if lost. (Email, domains, github, etc...)
2) I self host things that are absolutely critical with cloud backups. (Files, Photos, code, notes, etc..)
I am perpetually confused about why people think that self-hosting on a VPS solves their privacy and security problems. While I'm sure there are controls in place at reputable VPS providers, it wouldn't be too difficult for them to grab absolutely anything they want. Even disk encryption doesn't save you. You're in a VM, they can watch the memory if they need to.
Using a VPS can also make you more identifiable. Your traffic isn't as easily lost in the noise. The worst thing that I know of people doing is using a VPS for VPN tunneling. While it can have its uses, privacy certainly isn't one of them. You're the only one connecting into it and the only traffic coming out of it.
So while I agree with your sentiment, your details are a little off.
“it wouldn't be too difficult for them to grab absolutely anything they want. Even disk encryption doesn't save you. You're in a VM, they can watch the memory if they need to.”
It would be difficult because you’d have to have host access. VM disk encryption is now tied into an HSM or TPM these days, host access wouldn’t help. As for memory, that is now usually encrypted, so no dice there either. The security of a big name public VPS is astoundingly better than what you can do yourself.
“Using a VPS can make you more identifiable”
I think you have a problem of “threat model” here. You’re mixing up hiding against hackers, governments, etc and just lumping it under “privacy and security”
Using a VPS isn’t going to make you more identifiable to google, because you’re not using google now. Using a VPN isn’t going to make you more identifiable to your ISP, because all they can see is that you have a VPN up.
Why not use a VPS for VPN? Well you’re only right it would suck if your threat model includes governments or hostile actors, me hiding from my ISP or on a public Wi-Fi? Not a problem.
You conflate a few ideas and threat models.
Security = The ability to not have your stuff accessed or changed.
Privacy = The ability to not have your stuff seen.
Anonymity = The ability to not have your stuff linked back to you.
Threat model = Who are you protecting yourself from?
E.g., the steps I take to not get hacked by the NSA are different from the steps I take to post comments on 4chan or whatever, which are different again from the steps I take to use public Wi-Fi.
Ref: I work for Amazon AWS, my opinions are my own insane ramblings.
AMD secure memory encryption and secure encrypted virtualization. Intel probably has something in the works, but today you can take a GCE instance from a signed coreboot through bootloader and kernel with logged attestation at each phase resulting in a VM using per-VM disk encryption key (you have to provide it in the RPC that starts the machine; it's supposedly otherwise ephemeral) with SME encrypted RAM (again, ephemeral per-machine key). Google calls it Confidential VM and Secure Boot for now.
> It would be difficult because you’d have to have host access.
Which AWS has, by definition.
> VM disk encryption is now tied into an HSM or TPM these days, host access wouldn’t help.
Are you passing all of the data through the TPM? If no: you still need to keep the key in memory somewhere, the TPM is just used for offline storage. If yes: the TPM, and the communication with it, is still under AWS' control.
> As for memory, that is now usually encrypted, so no dice there either.
Still need to keep the key somewhere, so same concern as for disk encryption. Except I can pretty much guarantee you're not putting the TPM on the memory's critical path, so...
> The security of a big name public VPS is astoundingly better than what you can do yourself.
Feel free to back such claims up in the future. Because right now this seems to be as false as the rest of your post.
> Using a VPS isn’t going to make you more identifiable to google, because you’re not using google now.
What? It certainly won't make you less identifiable either.
> Using a VPN isn’t going to make you more identifiable to your ISP, because all they can see is that you have a VPN up.
Your VPN provider, on the other hand, can now see all of the traffic, where before they couldn't. So the question is ultimately whether you trust your ISP or VPN provider more.
> Why not use a VPS for VPN? Well you’re only right it would suck if your threat model includes governments or hostile actors, me hiding from my ISP
Sure, if you trust the Amazon over your ISP that makes perfect sense. Then again, this is the Amazon that seems to love forcing their employees to piss in bottles, and is on a huge misinformation campaign against treating their employees properly.
That seems like an upstanding place with great leadership.
> or on a public Wi-Fi? Not a problem.
Makes some sense, but it wouldn't really give you much more than hosting the VPN at home. (Well, you'd still have to do the same calculus here for home ISP vs Amazon.)
> You conflate a few ideas and threat models.
Pot, meet kettle.
> Ref: I work for Amazon AWS, my opinions are my own insane ramblings.
Good to know that AWS employees are either clueless about their own offerings, or deliberately spreading misinformation.
Again, with the insults.
I comment for your benefit, not mine. I already know the right answers here, it is you who are mistaken.
So you can either consider an alternative experience set which undoubtedly differs from yours, or not.
I don’t care either way.
VPS doesn't solve privacy and security, it solves getting locked out of your account because some algorithm decided you were peddling child porn.
If you want privacy and security and you don't trust your provider, then you have to build your own hardware and compile everything you run on it from vetted source, including your kernel. You can do it, but most people decide that on balance it's better to trust someone.
VPS doesn't solve privacy and security, it solves getting locked out of your account
Does it really? It just seems like instead of trusting a big company that everyone knows, you trust a smaller company that not everyone knows that involves more work for you.
I'm pretty sure I've seen articles on HN where VPS companies (maybe DO?) have kicked people off their infrastructure with zero notice. So, not at all different from being locked out of Apple/Google/Amazon.
Perhaps, depends on your needs, risk model, pricing etc...
I use DO but couldn’t use AWS due to price for example.
Hopping platforms whenever you catch the ire of the gods is a bad CX for this problem space.
As you can see, AWS is far from the only game in town. If you can't find two or three from that list that will meet your needs then perhaps you should reassess your quality metric.
(I note in passing that my preferred provider, Linode, is not even on that list.)
DO = Digital Ocean
Again, depending on your needs, VPS are not a commodity. GCP offers a few things that Azure or AWS don’t, etc...
Often times making sweeping generalizations without deep industry knowledge is a bad idea.
If you don’t even know what DO is, why do you feel experienced enough to argue?
Because I've been self-hosting my internet services on VPS's for the last 15 years on various providers. There are literally dozens of them. They are absolutely a commodity.
"Cloud" and "VPS" are not the same thing. VPS = Virtual Private Server, which is one very specific kind of cloud service. Cloud services in general are not a commodity, but VPS is.
How so? The VPS provider can shut you down as well. You might say the migration path is easier, but there will be a weak link somewhere. Even if you put a datacenter in your basement, you need to connect to the internet somehow, and that can be taken away.
Well, DO decided to lock me out of an account I'd had for years because they decided I was a fraud, and I had to deal with their terrible customer service.
With rclone you can encrypt data locally while uploading. This allows you to host everything from home and use the cloud only for backups, basically end-to-end encrypted.
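For example, a minimal rclone crypt setup can be as small as one crypt remote layered over a cloud remote (remote names are placeholders; the obscured password lines come from `rclone config` or `rclone obscure`):

    # ~/.config/rclone/rclone.conf (excerpt)
    [gdrive]
    type = drive
    scope = drive
    # client_id/token omitted; created interactively with `rclone config`

    [gdrive-crypt]
    type = crypt
    remote = gdrive:backups
    filename_encryption = standard
    password = <obscured password>
    password2 = <obscured salt>

After that, `rclone sync /srv/backups gdrive-crypt:` encrypts locally and uploads only ciphertext.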
A setup that probably works is vps -> tor -> vpn or some other order of these three, but I couldn't find any sort of blog that detailed setting up something like this so I imagine very few people are doing it.
I always think of it as – how many examples of "I got locked out of all my data!" would there be if billions of people start following the author's advice? Definitely more than the ~5 they list (whether that is user error or actually Apple/Google/Amazon's fault).
the 'duck it out' thing really made me cringe. we really need to get away from the idea of having a search verb that is tied to the popular search engines of the day. i use duckduckgo but it might not be around in 10 or 20 years, or there might be something better by then, so it's pointless to expect everyone to keep learning new verbs all the time.
Thank you all so much for your comments. I didn't expect this would get this high on HN. I'm aware there are simpler solutions for self-hosting, even partially. I'm also aware that my setup is not perfect - that's why this post was created. I was hoping to get some feedback. Not from this many of you, but from some friends. :) Ask me anything you like, I'll try to answer every question.
Your system architecture is very clean and understandable. I spend a lot of time marveling at the beautiful but often overly complex diagrams on r/homelabs, which more often than not dissuade me from actually having a go at it. Your explanation made it feel very approachable.
That being said...
> Some people think I’m weird because I’m using a personal CRM.
This strikes me as incredibly...German, hahaha! Is there any reason your Contacts solution doesn't/can't provide this functionality?
Heh, I'm living and working in Germany, but I'm not German, still (or yet). :)
Regarding CRM and Contacts - I could possibly fit all the info in the 'about' field for a particular contact, but Monica offers me so much more. With Monica, I can structure the data for a contact in a better way. That 'better way' and the feature set of Monica is why I'm using it.
I mean, I'm sold. I guess the biggest question is could your Contacts be pulled from Monica so that things like messages and phone apps pull that info?
The article sounds like you enjoyed building the system you put together, and I think that's probably a seriously undervalued aspect of why someone might take on this kind of work.
Thanks. It is kind of a show-off of what I built for myself. That's why I put that little disclaimer into the post, that it's not for everyone. I do have strong opinions about a lot of things regarding where I hold my data, but I don't want to strong-arm anyone into doing the same thing.
>Ask me anything you like, I'll try to answer every question.
What's stopping you from hosting at home?
While I admit that I often feel claustrophobic with only ~35-40 Mbps of usable bandwidth, my power costs for several orders of magnitude more usable storage+cpu are in line with what you're paying for VPS right now.
>I was hoping to get some feedback.
Do you run any additional layers of security on top of NextCloud? Something simple like requiring SNI to ward off casual scanning activity, or something more advanced like a WAF layer?
I ask because I've been hesitant to trust my whole digital life to something that doesn't have a full-time paid security staff.
"for purely private use, I wouldn’t opt for AWS even if I had to choose now. I’ll leave it at that"
I will elaborate: I started out with AWS several years ago. I could never work out how they calculated my bill, and had more than one >$100 shock for hosting my personal services.
I moved to DO and Vultr (stayed with DO for no real reason) and so shut everything down on AWS.
But I still got a $0.50 monthly charge on my credit card. I tried emailing - no response, totally ghosted.
I went through the control panel several times - it is/was a huge mess, obscure by policy obviously - and finally in some far distant corner found something still turned on. I did not understand what it was at the time and can recall no details, but I turned it off with great relief.
A week later I got an email from AWS (!) saying that I had made an error and they had helpfully turned the whatever-it-was back on...
So I continued to donate $0.50 a month to Amazon until I cancelled the credit card for other reasons. (it would cost $10 for the bank to even think about blocking them)
These days I will crawl over cut glass not to do business with that organised bunch of thieves called Amazon.
This inspired me to finally track down the $0.XX monthly donation I’ve been making to AWS. Through the billing dashboard [1] I discovered a zombie static site I set up ages ago with S3 and Route 53.
> I went through the control panel several times - it is/was a huge mess, obscure by policy obviously - and finally in some far distant corner found something still turned on. I did not understand what it was at the time and can recall no details, but I turned it off with great relief.
Using IAC (Terraform) would solve this in an instant: "terraform destroy". Done.
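For illustration, a tiny sketch (region and resource names invented) of declaring the kind of zombie static site mentioned above; once resources live in Terraform state, `terraform destroy` removes exactly those and nothing hides in a forgotten console tab:

    provider "aws" {
      region = "eu-central-1"
    }

    # The S3 bucket serving the static site
    resource "aws_s3_bucket" "site" {
      bucket = "example-zombie-static-site"
    }

    # The hosted zone pointing at it
    resource "aws_route53_zone" "site" {
      name = "example.com"
    }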
I am in the same boat. I'm not personally using AWS anymore but I'm still charged x.1x$ a month. It's not worth it enough to track the charge down, and I might just delete my account, not forgetting to change my email address beforehand (since you can't reuse a deleted account's email).
Y'know what? Although I'm currently self-hosting my email, my websites, my storage, my SQL, my Active Directory, etc., I'm also in the process of migrating the whole lot to Azure and/or independent hosting.
Why? It's just too much hassle these days; I want my down-time to be no longer dictated by my infrastructure. I don't want to have to spend off-work hours making sure my boxes are patched, my disks are raided, my offsite-backups are scheduled, and my web/email services are running. I just want it all to work, and when it doesn't, I want to be able to complain to someone else and make it their problem to fix it.
For my data, I'll probably still have an on-site backup, but everything else can just live in the cloud, and I'll start sleeping better, due to less stress about keeping it all secure and running.
I stopped self-hosting as soon as I moved out of university. Back in university I had a gigabit uplink and only 1 power outage in 7 years of my PhD. Now in the middle of Silicon Valley I have only 15-20 mbps and have had 3 power outages in 1 year.
Nope. I'm on a static business IP, with DNS all set up correctly. I've also got SPF records set up, but I don't think they get used, as I use my ISPs smarthost for relaying mail through.
I do get a lot of incoming spam though, but I think that's more to do with some of my email addresses being over 20 years old.
Not OP, but I've been hosting my own mail as well (postfix, dovecot, spamassassin) for six or seven years now.
Had one issue with outgoing mails to Microsoft (hotmail I think) bouncing. The IP of my dedicated server had been blacklisted from before it was used by me, but I got them to remove it. No other issues I can think of.
I'm getting about 1-2 spam mails a month delivered to my inbox, usually french SEO spam. Not worth investigating.
The author of this post cites $55/month as his cost. This is wrong. If it takes him, say, two hours a month to maintain (probably conservative) then if you value those hours at $100/hour the actual cost is $255/month.
The reality is probably in excess of $1000/month. This only makes sense for people who have an abundance of spare time, and that's pretty rare these days.
Free software for DIY hosting like this is "free as in piano." Like a huge piano sitting on the street with a sign that says "free piano," it is actually not free at all when you factor in the hidden costs.
Well that is only if you look at software/ops as a 100% commercial undertaking. It is not.
One way to understand why people self-host is to understand why people self-cook their food. It takes significantly longer to prepare food (get raw material, cut, cook) than ordering it. People still do it for $reasons - some find it fun, some find it cheaper, some find it nice to be able to control the taste, some find it more healthy to know whats going on their plate, and so on.
Only concentrating on the dollar cost is too narrow a view, IMO.
Well that is only if you look at software/ops as a 100% commercial undertaking. It is not.
Your time is only free if it is worth nothing. My time is very valuable. I happily pay other people and companies to do things for me because I'd rather have the time.
I think it's just a normal part of life. When you're young, you have more time than money. When you're old, you have more money than time.
Far fewer people cook their own food for fun versus preparation time not being the only constraint: cost, availability, health, transparency (of prep & ingredients), dependency etc
It's common for people to delude themselves into thinking they haven't wasted their time by convincing themselves they did it for fun (or the lols, or whatever) - I'd say the difference is whether they knew (or stated) this upfront, or only after they failed, or after a better solution was pointed out to them.
2nd most common also: at least I learnt something / gained xp - which is fair enough, if true.
> Only concentrating on the dollar cost is too narrow a view
Not if you convey other resources/constraints in dollars. Just attach a dollar-value to your free time, perhaps with discounts for things with side-benefits.
Fair enough, by "far fewer" you basically mean the same thing I meant by "some". The point was not to have an exhaustive list of the exact reasons behind people cooking.
> It's common for people to delude themselves into thinking they haven't wasted their time by convincing themselves they did it for fun
I am probably missing some context here because this does not make sense to me. Something is fun because it's fun; what does it even mean for someone to forcibly convince themselves of something that is otherwise? ¯\_(ツ)_/¯
> Just attach a dollar-value to your free time
I do that when someone asks me to do a project for them in my free time, so I can know what to charge them. But there is little value in assigning a dollar-value for time that I am going to spend doing something that _I want to do_ . Its like watching a movie, or making a sand-castle in the backyard. I won't enjoy it if I keep thinking "Damn, I just watched a movie for 3 hours, there goes $300 worth of time."
I mean the minority, which I'm not sure "some" implies. The factors you describe (cheaper, control, health) can be seen as facets of cost. More expensive places might taste better, be healthier, etc. While I'm sure a lot of people enjoy cooking, I'm not sure many would recreationally cook as often as they need food.
> context
You can have fun imagining the payoffs, only to find they do not appear. Have you ever seen a movie and been disappointed, or played a game and found it lacking? "Fun" is not an absolute measure, and a review doesn't necessarily capture how fun something is versus how fun you think it ought to be - plenty of people give things a higher value than their "fun" value, despite claiming to only have done it for fun - the missing value is ideological.
> there is little value in assigning a dollar-value for time that I am going to spend doing something that _I want to do_
Only if there is, for some reason, literally only one thing you want to do. But if there are competing things you might like to do, then comparing them makes sense. One way or another, if you choose to watch a movie, or build a sandcastle, you are comparing the two to decide which. Using monetary values is just a more formal way of doing that for larger, less impulsive projects.
As a developer, all of the time I spend working on hobby projects (and self-hosting has turned into a hobby) keep me up to date. It's how I learned Kubernetes, it's how I learned Traefik, nginx, and apache before that. It's how I learned how the different packaging and distribution ecosystems work for many different languages and frameworks. I intentionally host and backup some things on AWS, GCloud, and Azure. Other things live on Intel NUCs. I administer a GSuite for the family. The list goes on and on. It gives you the chance to experiment with new tools and toys that you're unlikely to use at your current job.
My long-winded point is that all of the things I've picked up have been invaluable to me at work, especially in my time as a contractor where I would be switching between many different stacks. If you want to find a "true" cost for self-hosting, you need to also treat it as training.
I don't really believe it's any different from say, a woodworker that has a shop at home. They may spend the workday just doing framing, but odds are good they find the time to make a chair, a bird house, something to keep their skills sharp.
As a developer, all of the time I spend working on hobby projects (and self-hosting has turned into a hobby) keep me up to date.
True for some things, like things that are not at all related to your work. But your job should be actively trying to make you better at your job, and a better person.
Large companies like the one I work for hire outside firms to offer classes to the employees for free, and on company time. If there is a new version of a piece of software that is significantly different from an old one, my company pays for the users to go to training, or to train online. This is very common for products like Office or the Adobe suite. But for some reason, as developers, we too often think that we're supposed to better ourselves on our own dime. If it benefits your current employer, the current employer should chip in.
I used to think this too, which is why I was self-hosting (I'm the OP of this thread), but as I've got older, and my interests have shifted, along with no longer needing to be at the bleeding edge of my skill-set (I leave that stuff to the younglings these days), I found that managing my own infrastructure felt more like a chore than a hobby, more so if it's a 'production' system and not a 'lab' environment.
"Free as in free puppy" is my other favorite metaphor. Free software is a gift to the word, but IMO it's important not to undervalue the time and expertise of operationalizing it.
It's worth remembering that you can get an expensive puppy too. I.e. choosing proprietary software doesn't mean that time and expertise won't be required.
Very fair point, since "free puppy" misses the "free as in freedom" aspect. I think it works better than "free as in beer", which captures part of the CapEx dimension but none of the OpEx dimension.
I would argue yes and no here. If those are two hrs where you are not employed making $100, then it's $55. If you have to give up 2 hrs of employed time to maintain this, then yes, $255.
I love my free time and there is precious little. But I don't think of it as costing ME $100/hr when I wash, dry, and detail my car, especially as I like doing it.
It’s for people with enough spare time. If you find yourself having to cut your work hours to the tune of $1000/mo (or more realistically, personal time that you value) to self-host some stuff compared to the time it would take to maintain non-self-hosted equivalents, then by all means, don’t – but that’s definitely not how “actual cost” works otherwise.
I second this, either get a Synology/Qnap NAS or take an old PC with a couple of drives and install OpenMediaVault/Freenas/Unraid. All of these platforms have out-of-the-box solutions that mirror most cloud services. I found the homelab reddit to be great.
If you get an off-the-shelf NAS, get one with at least 2GB of RAM! Synology is particularly notorious for selling NASes with 512MB (WTF?!) of RAM, and then when you try to run a few applications it grinds to a halt.
NAS fails for smartphone integration. Photos should auto upload. Calendar, todos, and contacts need to show up in the usual apps. It needs to be available from remote.
Synology NASes have various apps for syncing mobile devices, such as DS Photo for uploading photos to Photo Station, Synology Drive for more of a Dropbox approach, and MailPlus for contacts, email, etc.
For some of us, we turn it into a hobby. The only difference is that the technical knowledge and experience gained at work can also be applied at home (without a lot of restrictions).
What I meant is that I need some time to _not_ be productive. Like, actively not being productive. Literally wasting time for the sake of getting some peace of mind and true relaxation.
If your personal life is filled with productivity tools and optimizations, at what point in your daily life are you _not_ worried about productivity? If that time is zero, I think it's kind of sad and maybe even unhealthy. It's just my opinion, of course :)
I agree with you on the importance of non-productive time but I've found having my own infrastructure makes my life smoother day to day in exchange for some upfront cost. It's a tricky balance, and as many other commenters have mentioned that initial cost can end up not being so initial - though I think most people who engage in this 'hobby' generally find both the process and the product rewarding.
Funny headline, because every time I try to self-host anything important like mail, I learn how deep that field is and how little I know and that I'll probably need many many hours to do everything right and in a secure way (and my mails would still have a higher probability to be classified as spam). Then I think: "Screw it, I'll just use GMail"
Email is basically pointless to host yourself from a privacy perspective. Every email has one or more people on the other end that also get a copy. Privacy and email are mutually exclusive. That said, the alternative doesn't have to be something like gmail, where they can do whatever they want with your data. I use Fastmail and that's "sufficiently private" for my needs.
I use Fastmail + a custom domain. Because fastmail is the provider, I am not on any spam lists. My emails make it through. My service is very reliable, so I always get my emails. If fastmail decides to hate me, I can just point my domain MX records somewhere else.
The Apple example was proven to be inaccurate. The loss of access to email wasn't due to him not paying his Apple Card; it was for not paying for his iCloud services. In other words, his shutoff could have happened if he had his iCloud account set up to charge to any other credit card.
I use gmail too, but I also have an email from my cheap hosting as a backup. The big problem with gmail is not that Google is reading my email but that these giants can lock you out without a reason and without a right to appeal.
I am super salty that Sony banned my PlayStation account (used by my son) for 2 months (I have a Plus subscription paid for 1 year, too) with no way for me to see the exact reason (was it a text message, or a screenshot that was shared, or just a report from a troll) and no way to contest this. I made my decision: fuck consoles, my son will have to learn to use a medium-spec PC for gaming.
>The big problem with gmail is not that Google is reading my email but that these giants can lock you out without a reason and without a right to appeal.
I also worry about that. What I do is sync gmail with outlook.com. Now I worry that I just doubled my risk of falling victim to a security breach :-)
Regarding email, I spent some tens of hours setting it up, including implementing DKIM, DMARC, SPF and getting my mail delivered to Gmail and O365. That was over a year ago and things mostly just work with the occasional upgrade or configuration change. You could also save a lot of time by going with a pre-packaged solution. I understand if you don't have time for that, but at least in my experience, self-hosting email isn't the impossible task it's sometimes made out to be.
People’s experience with deliverability of messages from self-hosted mail servers seems to be very hit-or-miss, but I’m another one of the lucky ones. Rather than using Mail-in-a-box or something similar, I used a cautious step-by-step approach.
About 5 years ago, I read the O’Reilly book, Postfix: The Definitive Guide, that had been sitting on my bookshelf for years. I installed and configured Postfix as a sending-only mail server on a Hetzner VPS. I sent a few test emails to GMail accounts and a friend’s Office 365 and they both worked! I then gradually added extra layers of functionality (TLS, DKIM, SPF, DMARC).
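For anyone following along, the DNS side of those layers mostly comes down to a few TXT records; this is an illustrative sketch with placeholder values, not my actual zone:

    ; SPF: only hosts listed in the MX records may send for this domain
    example.com.                  IN TXT "v=spf1 mx -all"

    ; DKIM: public key published under a selector (here "mail"), key truncated
    mail._domainkey.example.com.  IN TXT "v=DKIM1; k=rsa; p=MIIBIjANBgkq..."

    ; DMARC: policy plus an address for aggregate reports
    _dmarc.example.com.           IN TXT "v=DMARC1; p=quarantine; rua=mailto:dmarc@example.com"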
Once I was happy that I could successfully send emails, the next step was to receive email: I added MX records for my domain and opened port 25 on the firewall. I was able to use Mutt over SSH to read emails sent to my account. I later installed Dovecot (excellent documentation) and Squirrelmail (lacking in features but was easy to install). I don’t really use web-mail but I’ll probably install Roundcube at some stage and I plan to learn how to use Sieve for automatic filtering.
I thought I’d have serious problems with spam and have to install anti-spam software and/or use black-lists but that hasn’t (yet) been an issue. Simply using Postfix default options along with grey-listing and not accepting messages from invalid (according to SPF records) sources blocks all spam. The only times I received spam was when I had accidentally disabled the grey-listing (the mail logs show I get hundreds of connection attempts with only a tenth of successful connections being genuine). The system actually works better than GMail in that I don’t miss messages that were wrongly flagged as Spam. Another benefit of self-hosting is that I can quickly and easily set up account-specific email addresses, e.g., <hackernews@example.com> – no need for <anthony+hackernews@example.com>
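Concretely, the greylisting plus SPF rejection can be a short restriction list in main.cf; this is an illustrative sketch assuming the Debian defaults for postgrey and policyd-spf (both also need their own service entries), not necessarily my exact config:

    # /etc/postfix/main.cf (excerpt)
    smtpd_recipient_restrictions =
        permit_mynetworks,
        reject_unauth_destination,
        reject_unknown_reverse_client_hostname,
        check_policy_service unix:private/policyd-spf,
        check_policy_service inet:127.0.0.1:10023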
I gradually started using it instead of GMail and it’s now my primary email account for important communication. In the four years of serious use, I haven’t had any problems (touch wood).
Once you start using Sieve, you can also start doing more fine-grained filtering of mail to locations and archiving things which you care about with a couple of bash scripts and systemd timers/cron jobs.
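A small illustrative Sieve script for that kind of filing (folder names and addresses are made up):

    require ["fileinto", "mailbox"];

    # Mailing lists go into their own folders, created on demand
    if header :contains "List-Id" "dovecot.dovecot.org" {
        fileinto :create "Lists/Dovecot";
    }

    # Per-service addresses get archived out of the inbox
    if address :is "to" "hackernews@example.com" {
        fileinto :create "Services/HackerNews";
    }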
My experience has been similar. The most important thing that can't be easily resolved is if the server's IP is already on the blocklists, but other than that DKIM, DMARC, SPF, and reverse resolution together solves the ~~magical ritual~~ heuristics that make the big guys happy.
You could also look for alternative providers. Posteo for example promises to run on FOSS code, and is endorsed by FSF. Definitely a middle ground between an artisanal email setup and gmail.
Interestingly, it doesn't look like the author is self-hosting email. I know mail-in-a-box exists, but even with that I find it's worth the peace of mind to pay someone else for mail hosting.
I ran my own email server on my own domain for years. You're right, it's kind of its own special nightmare to integrate all the parts of a comprehensive email system (I eventually started using Zimbra's free server software because of it), but spam effectively killed being able to self-host more. Even if you setup SPF and DKIM and all the rest, you'll find yourself getting blackholed anyway because you're NOT Google or Apple or Microsoft. It's not like I even sent email in bulk either. It was just my normal, personal account. But getting OFF blackhole lists became enough work that I had to route my mail through Gmail anyway, so I just gave up self-hosting entirely. That was, like, 7 or 8 years ago, though, so maybe things are different now, but I doubt it. I expect it to only have gotten worse.
Same here. Over the past 20 years (25?) I've tried numerous times to self-host email and the issues I had with spam and blacklisting were too involved for me to resolve, and only got deeper. Definitely not something to attempt casually.
No need to self host email, just use your own domain.
That way if a provider gives you lip, you move on. It’s the email address itself that is valuable, the emails can be backed up.
No problem if you pull your complete mailboxes using a tool like isync/mbsync. I have background jobs, in addition to regular backups, which pull to all my powered-on computers every 30m. As long as the source is IMAP, it's very easy to not get screwed. I couldn't care less if my email host locked me out today; I'd point my domain elsewhere, and I have all the data as maildirs.
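For reference, a minimal ~/.mbsyncrc along those lines (host, account names, and paths are placeholders; PassCmd assumes a `pass` password store):

    IMAPAccount mail
    Host imap.example.com
    User me@example.com
    PassCmd "pass show mail/imap"
    SSLType IMAPS

    IMAPStore mail-remote
    Account mail

    MaildirStore mail-local
    Path ~/mail/
    Inbox ~/mail/INBOX
    SubFolders Verbatim

    Channel mail
    Far :mail-remote:
    Near :mail-local:
    Patterns *
    Create Near
    SyncState *

An `mbsync -a` from cron or a systemd timer then keeps the local maildirs current.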
You don't really own any of your cloud data, even if it feels like it. If you want to own your data then it needs to reside on private computers in private spaces - though that does not preclude you from sharing but you lose control of what you share.
This is a key reason for why I started Helm - thehelm.com. There's a lot of talk here about the hassle of self-hosting and while many HN folks are perfectly capable of running their own servers/services, it can be very time consuming. We take away the hassle and provide the benefits of self-hosting at home.
I don't self-host everything, for me it's enough if I can take out all my data. I have all my cloud hosted things mirror via rclone to local storage. So I'll gladly use git, IMAP, CalDav, CardDAV as a service but I'll have my local hot mirror and cold backups ready any time. Tools are readily available:
* git is git and clones
* IMAP gets pulled via mbsync
* CalDAV, CardDAV get pulled via akonadi and exported to flat files from there
* Remote SSH/SFTP accessible storage gets pulled using rsync
* Other remotes get pulled using rclone
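Glued together, the recurring job is little more than this kind of script (remote names and paths are invented for illustration):

    #!/bin/sh
    set -eu

    # Git: bare mirror clones that can be re-pushed anywhere
    for repo in "$HOME"/mirror/git/*.git; do
        git -C "$repo" remote update --prune
    done

    # IMAP mailboxes to local maildirs
    mbsync -a

    # SSH/SFTP-reachable storage
    rsync -a --delete homeserver:/srv/files/ "$HOME/mirror/files/"

    # Everything else (object storage, Drive, etc.)
    rclone sync gdrive:Documents "$HOME/mirror/gdrive-documents"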
I always wonder why people don't trust their offsite back-ups to cloud providers. I know they're trying to get away from getting locked out of their data, but what are the odds a burglar steals their computers on the exact same day their cloud provider locks them out because they violated the 'no making fun of ridiculous cloud provider lockout policies' policy?
As long as your house burning down and your cloud getting locked don't occur on the same day, you're golden, and thus no messing with Blu-rays and bank security boxes.
> Every last weekend of the month, I will manually backup all the data to Blu-ray discs. Not once, but twice. One copy goes to a safe storage space at home and the other one ends up at a completely different location.
This is one paragraph after mentioning a 2TB+2TB NAS. Even assuming that's RAID1, a standard Blu-ray only stores 50 GB, so you need 40 of those. And then you need another 40 for the other location... every month?? Honestly it's probably cheaper to buy a new 4 TB hard drive every month.
If you're a cheapskate like me, backing up your encrypted (e.g. Borg) backups to a cloud provider like Google Drive isn't a bad option. My org provides me with unlimited cloud space, so I have hundreds of encrypted gigs on Google Drive. No reason to think it'll disappear overnight.
> it's probably cheaper to buy a new 4 TB hard drive every month
The reason for blurays might be that he's following rule 2 of the 3-2-1 Backup Strategy. 3 backups, 2 different types of storage media, 1 copy off-site.
If your house is hit by lightning it could wipe all your magnetic and solid state drives, but optical discs would probably be ok.
Most definitions of the 3-2-1 rule count the original data as one of the 3 copies [1][2][3] and don't go as far as to specify that you should be diversified against literal medium type. (Most recommend an external drive or NAS for your second local copy)
The "my house was hit by lightning" case is covered pretty well by your 1 offsite backup.
> I always wonder why people don't trust their offsite back-ups to cloud providers.
Oh I trust them not to delete my encrypted data and that's all that I'd ask. But I still don't have backups on hardware that I don't own: it's super expensive. From the cost of hosted backups, I could buy a hard drive of the same storage space every three months or something. And hard drives live longer than three months.
Putting a raspberry pi with a big hard drive at a friend's place is rather power efficient, dead simple to setup if you have any technical knowledge at all, and off-site. Use Restic or something to avoid having to trust your friend or whoever will burglarize the place. They have an Internet connection anyway and where I live, there are no bandwidth caps.
Use borg or other encrypted backup tool (Restic is also recommended by others in this thread). rsync.net is my offsite backup. Doesn't matter what anyone demands of them, all they ever see is an encrypted blob.
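A minimal sketch of that kind of borg-over-SSH setup (the rsync.net path, archive names, and retention numbers are placeholders, not a recommendation):

    export BORG_REPO='ssh://user@user.rsync.net/./backups/borg'

    # One-time: create the repo; repokey keeps the passphrase-protected key inside it
    borg init --encryption=repokey-blake2

    # Recurring: encrypted, deduplicated backup of the important directories
    borg create --stats --compression zstd ::'{hostname}-{now}' ~/documents ~/photos

    # Thin out old archives
    borg prune --keep-daily 7 --keep-weekly 4 --keep-monthly 12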
You do then need to find somewhere to store an offsite backup of your encryption keys. That said, since those change far less often than your backups, options like a safety deposit box are a more realistic place to store keys than the backups themselves.
Yes, borg also has the option of storing them in the repo itself, protected by a passphrase (think encrypted ssh key files).
Anyway, my "home burned down rescue bundle" consists of a flash drive with a keepass export of my password vault and encrypted borg repo key / rsync.net ssh keys at the office.
Slightly less accessible in this pandemic world, but no safety deposit boxes needed.
Important to reiterate this, since it's an interesting feature - borg does, indeed, allow you to store the encryption key for your repository inside the repo itself and you just need to remember/safeguard the passphrase.
I can't say whether this is a good choice for any particular use-case but I appreciate that it is an option ...
For those that are curious, yes, you can later export the key from your repo so that you have a copy elsewhere ...
Yeah I tend to trust B2 for my offsite. I have redundant storage locally with snapshots. That covers up to two disks failing or even someone trying to wipe storage over SMB.
The offsite protects against catastrophic failure or theft. I would still like to add in another backup for critical data such as databases, I may use another cloud provider that has geo redundant storage for those.
I had the same thought as the title of the article go through my head, but we ended up with a simpler setup as I wanted something I don't have to constantly mess with:
* Put together an overbuilt NAS box running ZFS On Linux
* Simple docker-compose file for all services
* Backups through borgmatic (via ZFS snapshots)
* Auto-updates through watchtower
* Punted on email and use FastMail, switched to our own domain from gmail
Services we run include:
* PhotoPrism for semi-Google Photos functionality
* Nextcloud and Collabora for file sync, sharing
* Kodi for home media
* Tiddlywiki
* DDNS through Gandi since we're on a dynamic IP
* PiHole for some ad/privacy protection
* Robocert for SSL
* Nginx to reverse proxy everything
It wasn't _easy_ to set up, but in a year, any given week I typically spend 0 hours dealing with it. No problem that _has_ cropped up has taken more than a few minutes to fix, mostly around docker networking and auto-restarting containers after Watchtower auto-updates them, a problem I've since fixed.
This setup seems way easier than k3s or some other recommendations, doesn't require much new knowledge, and is as portable as I need it to be. If needed I could plop the docker-compose on a new machine, change some mount points, and largely be up and running again quickly. It's let us switch to "deGoogled" phones and unplug from almost every hosted service we used to use.
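For a sense of scale, an abridged and purely illustrative compose sketch of that kind of stack (image tags, paths, domains, and which services made the cut are assumptions, not the setup described above):

    version: "3.8"
    services:
      nextcloud:
        image: nextcloud:apache
        restart: unless-stopped
        environment:
          - NEXTCLOUD_TRUSTED_DOMAINS=cloud.example.com
        volumes:
          - /tank/nextcloud:/var/www/html

      photoprism:
        image: photoprism/photoprism:latest
        restart: unless-stopped
        volumes:
          - /tank/photos:/photoprism/originals

      nginx:
        image: nginx:stable
        restart: unless-stopped
        ports:
          - "80:80"
          - "443:443"
        volumes:
          - ./nginx/conf.d:/etc/nginx/conf.d:ro
          - ./certs:/etc/nginx/certs:ro

      watchtower:
        image: containrrr/watchtower
        restart: unless-stopped
        volumes:
          - /var/run/docker.sock:/var/run/docker.sock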
> I’m living in Germany, so the obvious choice was to spin up my instances in Vultr‘s* data center in Frankfurt, as ping is the lowest to that center for me.
The author is probably aware of this, but just in case they aren't: Hetzner is an amazing company with two or three datacenters in Germany. I don't remember if any of them are in Frankfurt, but given they offer VPSs and beefy dedicated machines, I'd be fine trading a couple milliseconds for this flexibility (and overall better pricing, even if Vultr's isn't that expensive as well).
I'm running a few bare-metal Hetzner servers (Falkenstein and Helsinki). Can recommend. Reliable, comparatively cheap and tickets are usually responded to within 24 hours. One time I even got through to the guy swapping defective hard drives on one of the servers in the data center by phone, since there was some other issue.
I would say that hosting in a German datacenter owned by a German company is the worst way to get complete exposure to the ham-handed and completely out-of-control German intelligence apparatus. Maybe being a German citizen protects you slightly from the BND, in the same way that the NSA technically doesn't spy on Americans, but I doubt it.
I don't know what qualifies as "amazing company" in your eyes (they're certainly cheap for what you get), but my experience was certainly very bad:
I rented a VPS from them experimentally for a month, then left to go on vacation thinking I had cancelled it, but I had not. They left it running for another month that I hadn't paid for, and then sent the bill for the extra month to collections, so that's presumably affecting my credit score now.
Sending a bill for a recurring service to collections rather than just canceling the service is trash-tier company behavior, IMO. I strongly recommend against using Hetzner.
I can see why that's a bad experience for you, but IMO Hetzner did the right thing here.
What if I forget to pay my AWS bill for a week and, because of that, all my resources get deleted? IIRC AWS will inactivate your account after 3 months without payment, so the same thing would've happened there. 3 months is a nice trade off between "forgot to pay my bills" and "no longer use the service".
ETA: for sure it sucks they sent the bill to collection. Uncalled for, they could've attempted to settle directly with you first.
Agreed. Hetzner is very strict about not leaving a penny of theirs wasted/delayed without compensation. I know a couple of friends here in Turkey who were contacted by a local collection agency for late settlement and threatened with legal proceedings if they didn't settle soon. Aware of this, I started to never ever /forget/ about paying any of my bills on time.
I'm curious as to for how long you've been using this setup specifically in regard to Nextcloud, and how many and what volume of files you store in it?
I've set up a few Nextcloud instances in the last 2 years on Digital Ocean VPSs and Raspberry Pis and I ran into so many problems and difficulties which scaled with the quantity and size of files I hosted on it. I took care in setting up everything to a relatively solid standard (memcache etc.), but I found Nextcloud to be so unreliable for syncing particularly with the official Android and Linux clients. Plus, there was the whole botched version 20 upgrade.
I find Nextcloud tries to solve too many problems turning it into a bloated mess even for a moderately experienced user.
For file storage only, I've found Syncthing on a Raspberry Pi at home syncing over Zerotier (for when I'm not at home) to be a much more robust, user-friendly and scalable solution, despite it syncing whole folders only.
Raspberry Pis are toys, don't use them for anything important.
I've been running NextCloud on an Intel NUC i7 in a Linux container so I can easily snapshot and backup. Recovery is easy in the event of a hardware failure as replacement NUCs are available off the shelf. All I need to do is swap the HD and RAM and I am back in business.
i have all of that with a pi 4 with a USB 3.0 2TB disk connected..
raspberry pis are so cheap that you can keep a spare one around just in case the hardware fails.. although i have 3 hosting different stuff on my network and don't feel the need to have a spare one.
i can get a new one by tomorrow cheaper than any used NUC out there, and all i have to do is remove the sd card from one, put it in the other, plug in the usb disk, and i am good to go.
my home servers are all pis, with the oldest turning 8 years old already, and i have not had a single problem with them in all this time.. likely my disk will die much sooner than the pi.
Indeed, I don't really consider personal projects and niche PDFs I want to sync between devices that important. Also, I didn't want to spend ~£400 on something like an Intel NUC.
When I was picking a self-hosted Dropbox alternative, I ended up going with Seafile rather than Nextcloud since it was a lot more focused (just file sync) and people said it was much faster. It's got a few rough edges but the core functionality has been rock solid. Granted, I've only had about 100 GB max stored in it.
I didn’t realise Seafile Pro was free for up to 3 users. I had previously looked but gave up as no self hosted solution was really comparable to OneDrive or Dropbox.
I've been using Seafile for several years, and highly recommend it.
For the past couple of years, I've had it running in Docker on a cheap Azure VM, backed by blob storage. I've got around 2TB in there, and it all works marvellously.
We've been using Nextcloud in my home for the better part of a year now, almost completely problem free. I even have auto-updates via watchtower. We have 136 GB of data on it (just checked now). Not sure where that lies compared to your data. It is running on a fairly beefy box though, not a rPi. Only issues so far have been needing to set up cron, which took about 5 minutes doing it the "easy" way (the host runs a docker command in its crontab). Collabora was super annoying to set up, but that was a one-time cost.
Interesting. My volumes were similar and I even had issues with my 'beefy' enough DO VPSs. The primary issues for me were with the clients, especially if I, say, moved a folder of 2000 files from one directory to somewhere else within the Nextcloud drive using the UI. Anyway, I'm not here to troubleshoot that - I've long since decided that it's just too much for my personal simple use case of keeping two folders in sync with each other on different devices. Out of curiosity, how did you install Nextcloud? Snap/Docker/Manual?
Ah, during our migration we did try to move thousands of files from a "Dropbox" folder to a "NextCloud" folder, and indeed the Windows client was not happy. Since it was a one-time thing, the solution was to move the files "manually" over SSH and just run the NextCloud "scan" utility to pick up the changes on disk.
I'm running NextCloud via the official Docker image, reverse proxied through nginx.
My good old friend, the Nextcloud scan utility :) I lost count of the number of times I ran that and the trashbin cleanup. These are both problems I never ever want to have to deal with.
eh, I ran the command, alt-tabbed to something more interesting, and checked later in the day to see that it was done. Never had an issue running it, and only ever needed to when I was doing the initial data migration.
Syncthing is great! Are you using ZeroTier for transport security, or does it improve speed as well? Off-site syncs are far too slow for me. Luckily, it's very rare that I have to do them.
I'm using ZeroTier so that my devices can stay in sync with my Raspberry Pi at home when I'm outside my local network. I found it a simpler solution than dynamic DNS.
There's an opportunity here for someone to build a "platform" that makes this all plug-and-play, like what the Apple/Google app stores have done, but where the end user has control.
Something along the lines of: someone buys some hardware with this platform on it and gets a GUI that lets them install "apps" on top of it.
Personally, I've got a home setup that is on its way to what the OP has, but I think there's demand from non-techy folks to get off the big companies' apps and onto privacy-focused ones that they control.
Can't believe Yunohost isn't cited more in this thread. I had to set up a whole productivity suite a few weeks ago, and it took me less than 2 hours to get everything working right.
I am particularly impressed by how easy the update process is.
Yup, the sibling comments mention a few alternatives (FreedomBox and Yunohost) but Sandstorm is really the only one I've ever used that makes me confident in the state of the system long-term. Let me elaborate on that.
FreedomBox and Yunohost use more traditional software installation mechanisms; they'll install packages, run scripts, etc. They just add (sometimes very nice) UI around it. While that's great for some things, after a while things can get a bit messy. For example: what about when a package installation fails for some reason? Or one of the configuration scripts fails? Well, you're stuck logging in and troubleshooting, which isn't super fun (and might be intractable for less technical users).
Sandstorm, though? Everything is sandboxed and isolated from the rest of the system. Everything. Backing up or restoring an instance of an app is a few clicks in a web interface. Sandstorm handles auth so the app doesn't have to... etc etc.
This has its downsides, namely that apps that aren't written with this sort of usage in mind might not fit in as well. But for those that are, it's by far the best experience I've had. I have Yunohost and FreedomBox servers in varying states of disrepair, but my Sandstorm server keeps chugging along. Big fan.
FWIW, there are places Sandstorm could improve here. Probably the biggest one for me is that Sandstorm backups do not happen automatically in the managed space. (You could automatically back up your Sandstorm server with another utility, and you can manually backup/restore individual grains in the web UI, but there isn't yet a really clean integrated way to restore grains inside Sandstorm.) But if this is the one thing you have to figure out outside of Sandstorm itself, that's not too bad (or unusual for many server applications).
Also, the parent suggests being able to offer a hardware box good-to-go, and I'd like Sandstorm to have that, or at least, a full distro release, where you do not have to worry about the server OS at all. It's something we've talked about quite a bit.
I'm a contributor, I wouldn't say I am "from Sandstorm" though. I actually looked at Restyaboard packaging a couple times, but the roadblock I hit is that there is currently no working example of a Sandstorm app running Postgres as the backend. I believe another contributor managed to get an app running using Postgres, but I don't know how they did it. I think there's some aspect of the Sandstorm sandbox that throws Postgres for a loop, and you have to kinda hack around it.
> I believe another contributor managed to get an app running using Postgres, but I don't know how they did it. I think there's some aspect of the Sandstorm sandbox that throws Postgres for a loop, and you have to kinda hack around it.
That would be me. I've done it on a private app and helped bring it to another app, so it's repeatable. I'll try to explain it on the sandstorm-dev group in the next week or so.
Eh? People who installed Sandstorm in 2014 are still getting regular security auto-updates today, even if they haven't touched their server between then and now. The very first app package ever built for Sandstorm -- created before Sandstorm was even announced publicly -- still works today, on the latest version of Sandstorm.
But what will you do if people aren't telling you exactly how to run your life and your setup? I certainly appreciate the effort and will be digging into this. I'm so sick of the tyranny. I've started my own 'disconnect' plan, and this is giving me a lot of ideas. I've already deleted Facebook, Amazon (that was a hard one), and well on my way to independence. Google is next, and like another commenter I'm using Proton mail now exclusively. Kudos for your efforts to help those of us that are really struggling right now - much appreciated.
I've been running Nextcloud on a DigitalOcean droplet, backed by S3-compatible storage from Wasabi, for about 3 years now - it's been pretty seamless. I think the old Nextcloud client syncing issues are a thing of the past (unless you work with really big files). Costs me $15/mo total.
My Nextcloud instance gets one-way synced to a NAS once daily using rclone, and one-way synced weekly as a tar archive to OneDrive (the 1 TB of storage from Office 365 is otherwise unused, so...). The rclone setup is all docker-compose + sops for the rclone config, so I can just git clone and docker-compose up anywhere to get another machine backing up.
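Stripped of the docker-compose wrapping, the two sync jobs boil down to something like this (remote names and paths are placeholders):

    # daily one-way sync of the data directory to the NAS remote
    rclone sync /srv/nextcloud/data nas:nextcloud-backup
    # weekly tar archive streamed straight to OneDrive
    tar -czf - /srv/nextcloud/data | rclone rcat onedrive:backups/nextcloud-$(date +%F).tar.gz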
A nice addition is that the droplet serves as a WireGuard server that all my devices are pretty much always connected to (with split routing).
I host a couple of other services on the droplet including The Lounge for IRC, my personal website and a pastebin type app.
I chose Fastmail for Yubikey support, and unlimited mail aliases. A little dissatisfied with mail organization tools on the webmail GUI — but it’s also the client for alias creation, so gets frequent use anyway.
"A drinking game recommendation (careful, it may and probably will lead to alcoholism): take a shot every time you find out how someone’s data has been locked and their business was jeopardized because they didn’t own, or at least back up their data."
You could play a reverse game of every time somebody lost all of their data (or more probably, photos) because they owned everything and _thought_ they had backups too. (e.g.: when the OVH datacenter burned down)
I said "screw it" after my Oracle Cloud "always free" account was terminated with no recourse, a few days after having activity on the database building an application prototype, well under the resource limits. I'm now running a libvirt VM on my laptop to develop the prototype.
Others have complained about Oracle Cloud's draconian practices. Doesn't sound like a company that wants to build a cloud business.
I think they're making a classic BigCorp mistake: thinking they can just focus on the high-end customers. They don't realize that everything is connected in this industry, and today's hobbyists and sole proprietors are tomorrow's Fortune 500 VPs of Engineering (and vice versa).
Oracle is extra slimy with their "always free" stuff, though. It's free in the sense that you'll have salesmen after you, and "always" as in maybe two months.
GCP, on the other hand... I've been running on their always-free tier for years now. (One of those 1/4-CPU WordPress VMs.)
I'm doing something similar with a NUC that I colocated. $27/month for a gigabit port + 5 IPv4 addresses, and it's far more powerful than any VPS I could get for the same amount of money.
It was a little bit of work to set it up initially, but now I maybe spend 30 minutes a month making sure things are updated. Hosting my own wiki, DNS over HTTPS server, Matomo analytics, and a few other random services.
Wow, maybe my understanding of colocation costs is outdated... but $27/month sounds crazy inexpensive! May I ask @dervjd where / from which colo provider you are getting such pricing?
EndOffice, out of Boston. Their website is somewhat of a mess, but they're legit. Been with them for 6+ months now with zero issues. I'll send you a message with details.
Isn't $55 a bit high in total cost? Aside from the 2 servers for projects, all of those aren't going to need entire servers just for 1 user. I've run Nextcloud doing all the same stuff for half that price and don't think Gitea or Monica would add much overhead.
I'm aiming to do a lot of the same (and more) but definitely aiming at a much lower monthly cost.
I run Gitea on a Raspberry Pi and it works OK, along with a couple of other containers plus a Nomad/Consul client. It's just me using it and I've had no problems.
Tried the same some time ago. While the setup is fun, the maintenance is mostly underestimated. Following Murphy's law, things tend to break at the most inconvenient times (deadlines, etc.).
My (current) strategy: do without the "last bit of functionality" and stick with boring, local software and approaches. Not everything needs to be synced to / accessible from every device -- at least for me... One well-backed-up machine, a few online services (e-mail, GitHub for collaboration, ...), and long-proven applications like Photos.app. Something close to the situation 15 years ago?
Not as easy though; I still need to figure out a backup strategy and everything. My goal is to eventually move photos, and really almost everything, off hosted services entirely.
I have a simple script that tars and gpg encrypts specific directories nightly. It’ll also reap backups in a sane way (only keep one per week for the last month, one per month for the last year, etc). And it’ll stop/start services/containers while backing up as needed. Then I distribute the backups to various devices using Syncthing.
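A minimal sketch of that kind of script, minus the retention logic and container handling (paths and the GPG key ID are placeholders):

    #!/bin/sh
    # nightly tar + gpg backup of selected directories
    STAMP=$(date +%F)
    tar -czf - /srv/data /etc/myapp \
      | gpg --encrypt --recipient 0xDEADBEEF \
      > /backups/backup-$STAMP.tar.gz.gpg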
I’ve been thinking about also having an off-site backup (for a house fire, electrical storm when everything is plugged in, etc), but that might be slightly paranoid.
You can still have a backup of your files and push them to another provider without self-hosting. It will take up 10x-100x your time to learn and use and maintain these alternatives, versus just taking a regular backup and using a managed provider.
It seems like 95% of the adherents to self-hosting do it as a hobby but pretend it's prudence.
Vultr seems entirely unnecessary in this picture (but the referral dollars probably help). They are just hosting stuff for themselves, right? The Synology can do all that (through VPN for the on-the-go devices). Separate VPSes for things like a 1-user Monica instance are insane.
I use AWS - the customer service seems great - I've personally received good service, and the support lifetimes are amazing (I used SimpleDB).
These articles with "I wouldn't use AWS, I'll leave it at that..." - be more specific!
For personal stuff, ECS/Fargate works well in my use cases. I put together a little Docker image and away I go - I pay for one reserved instance, which saves money, and use Fargate for stuff that is occasional or bursty (when I started, Fargate pricing was too high).
Docker is in some ways self-documenting - I also have a home server setup, complete with router etc. - but someone is going to bump something at home, and the reconfig/re-setup time is much longer than with AWS.
The thing that bothers me most about building all these private cloud setups is security. It is not a big deal to grab all these bricks of your infrastructure from GitHub and run them, but how do you maintain them? Everything has to be updated regularly; otherwise you risk having your data dumped and leaked by some automated crawler or home-grown hacker once a new vulnerability is discovered in any part of your tech stack.
The only easy solution I see is to hide everything on a private network and make it accessible only over VPN. However, that is not very useful when you need to grab a file or read/reply to email from a new device not owned by you.
Yeah this sounds like way too much maintenance overhead. I've opted in for a middle ground: I don't use FAANG/MS for anything critical and choose other hosted solutions instead.
Posteo was already mentioned; Mailbox.org is also nice for e-mail with your own domain. I only had to set up the DNS records once and rarely have delivery problems.
Nextcloud doesn't need to be self-hosted either, there are many good providers.
As someone who has self-hosted for a couple of decades, I can understand the lure of it, but the author forgets to mention the huge effort it takes to keep public servers available and free of unwanted visitors.
I've gone the other way. I had everything on a Synology box at home, backed up locally and remotely, with a Proxmox server on a DMZ network, mounting all (data) storage from the Synology via Kerberized NFSv4 through the firewall, and exposing select services to the world (limited by IDS/IPS and geo-IP filtering).
I spent around 1-2 hours daily checking logs, installing patches, checking backups, and doing other sysadmin maintenance jobs. When 2021 rolled around, I decided I no longer wanted to be a sysadmin in my spare time, so I quit.
Everything previously hosted at home was pushed to dedicated hosting providers for that type of service (PythonAnywhere for Django projects, etc.). Not just a VPS, as that's essentially just self-hosting on other people's hardware.
Basic file synchronization went to Microsoft 365 Family. Sensitive data is manually encrypted with either LUKS or encrypted sparsebundles.
As for my Synology, I pushed all the data on it to Jottacloud via rclone and the crypt backend. I then have a machine at home with a 1 TB SSD acting as my "NAS", but in reality it's just mounting the Jottacloud data and using the SSD as a VFS cache. It then exposes the Jottacloud data through Samba.
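For the curious, that cloud-backed "NAS" is essentially one rclone mount with the VFS cache pointed at the SSD; a sketch with placeholder remote name, paths and sizes:

    # mount the crypt remote with a local SSD-backed VFS cache, then export via Samba
    rclone mount jottacrypt: /srv/cloud \
      --vfs-cache-mode full \
      --vfs-cache-max-size 800G \
      --cache-dir /mnt/ssd/rclone-cache \
      --daemon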
The NAS handles backups of Jottacloud and OneDrive to a local 8 TB USB drive. A remote machine wakes up once per day, mounts the cloud shares, and makes a backup as well.
In case I get locked out, it's just a matter of restoring one of the backups to whatever storage I have sitting around, and I'm back in business.
As for speed, the VFS cache really speeds things up. I get gigabit speeds on cached data, and even uncached data arrives at an acceptable pace (500/500 Mbit connection), to the point that when I'm on WiFi (802.11ac Wave 2) I can't tell the difference.
On top of having a lot less noise around me, I also save about half the cost of the self-hosting hardware, spread over a 5-year period.
Depending on what you need, a NAS + Syncthing is much simpler than the linked article. Building a PC isn't hard, and it keeps prices down. These days, an RPi 4 + 2 USB HDDs would run circles around the motherboard in my NAS.
Syncthing is a great continuous backup solution. I use ~/NOTES as a scratchpad, and it updates automatically between my various computers. It gives you pretty granular control over shares, and I back up critical stuff to a cloud provider.
That said, there's no calendar/email/notes. XigmaNAS is built on FreeBSD, and will happily run NextCloud or a photo gallery or whatever.
I like the article and I agree with the sentiment.
I think that self-hosting can be quite a bit of effort, but a tool like Ansible makes it so much easier.
Whatever you choose to do, the most important thing is that you create data(base) backups and store those in an environment that you can control at all times.
There needs to be a viable exit strategy, just a backup is not enough if it takes more time to restore operations/service than is viable from a business perspective.
Perform at least a risk analysis, whatever you choose, make it a conscious, deliberate decision.
> Every last weekend of the month, I will manually backup all the data to Blu-ray discs. Not once, but twice. One copy goes to a safe storage space at home and the other one ends up at a completely different location.
The author has a lot more patience than I do. From their description of the NAS, they have at least 2 TB capacity. At 50 GB per disc, that is 40 Blu-ray discs to reach 2 TB, and 80 discs to do it twice. There is no way I would spend a weekend every month burning and verifying 80 Blu-ray discs.
Hosting my own too. There's Gmail as a backup, but I host my mail server, webmail, IMAP, SMTP, everything.
Blocking spam isn't the problem; making sure your mail reaches the receiver's inbox is.
You can block 90% of the spam with just a reverse DNS lookup -- doesn't match? Reject. 90% of the remainder can be blocked with DKIM and SPF checks. No need for IP blackhole lists or spamassassin training.
The benefit: I can block a sender or their domain in a single click from webmail. Couldn't do that on Gmail.
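For anyone wanting to replicate that first check: assuming a postfix setup (the comment above doesn't say which MTA), the reverse-DNS rejection is roughly one restriction in main.cf, with SPF/DKIM handled by whatever milters or policy services you already use:

    # reject clients whose IP address has no PTR record
    smtpd_client_restrictions =
        permit_mynetworks,
        reject_unknown_reverse_client_hostname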
At the moment I don't filter on the server at all, because I think anyone should be able to reach me no matter how dumb their mail setup is. Send me mail by manually typing into a TCP:25 connection from a residential or mobile IP, I don't care. No DNS checks, etc. My postfix config is very permissive; only relaying is disabled.
I filter using bogofilter on emails delivered to my public addresses.
Private randomly generated aliases don't get filtered at all (only the sender knows them, so I just disable the address if it gets abused).
It works nicely, especially the private alias part.
I have the alias/mailbox tables in a PostgreSQL DB, but I don't bother trying to connect postfix directly to the DB. I just dump the tables to postmap files on each change. It's infinitely more performant and reliable, which is what this has to be.
I can also dump the DB to my MUA's config, and have it rewrite all the random addresses into something readable.
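The dump itself can be a cron-able two-liner; a hypothetical sketch (database, column and map names are placeholders):

    # regenerate the postfix virtual map from the DB and rebuild the .db file
    psql -At -F ' ' -d maildb -c "SELECT alias, mailbox FROM aliases" > /etc/postfix/virtual
    postmap /etc/postfix/virtual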
Definitely the wrong approach. I wrote this on another board:
> Everyone has 100Mbit lines now, a lot of people have gigabit fiber internet at home.
> You can get a Cortex-A55 TV Box for $30, plug in your old SSD drive via USB 3.0 with
> a $3 adapter, install Linux and you are ready to go. It consumes virtually no power.
> The processing speed and disk speed is incredible. Often the ping is lower than in a
> datacenter. This is not even the future of hosting. It has been around for quite some
> time. It is totally superior to any mid-range server. There literally are only advantages.
Pair this with Yunohost (via Docker). Yunohost is like an app store for Linux servers: easy one-click setups for Nginx, XAMPP, Postfix, Dovecot, etc. that average people can do and understand.
You can still use the TV box as a media center, even run LibreOffice and Blender on it like a small mini PC with "poor but good enough" performance for most everyday tasks. Also games via RetroArch.
Sounds too awesome to be true? Yes, it is not quite true yet. You can do all this, but you still need to be tech-savvy to step through it. And the media-center part is still questionable, because the video drivers (the ones that work with hardware video acceleration) are buggy on most SoCs. Games work, though, just not HD video.
My HTPC gets parts handed down from my gaming PC.
I want to leave it "always on" anyway and so I was thinking of yunohost but you've now confirmed my path. Thanks!
"Is it worth the time and hassle? Only you can answer that for yourself."
No. Absolutely not. The little sysadmin work I have to do at work is all that I want to do. I trust Apple and Google with all my stuff -- iCloud storage, passwords, Google for email, etc. It just works, and I can move on with my life and focus on things of value to me instead of worrying about an upgrade blowing things up, security patching, backups, etc.
I've just finished putting together some old machines, setting up my home cluster with k8s, and porting the first app onto it. An OK-ish way to spend some of my Easter holiday.
Looks like the author is undecided about what to push next. Hardware and software "solutions" are not the issue; what his data is actually worth, to him and to the pushers profiling the global masses, is still opaque to the author. F** the data: it is the amassed, filtered, analysed dataset that is globbed over the wire that matters. If the author really has some content with built-in rationality and original expression, it is probably half an A4 page in handwriting. That would be his real backup (so as not to forget the once-in-a-lifetime flash of insight from his processor-mind), as it would be his legacy to the world. His billing and buying patterns, with his earnings defining his prodigy of consumption rather than power: who cares? What the glob tells about similar individuals, that is what power minds.
The above is just to repaint the context; really, this article is as close to a reduction to "nothing" as can be conceived.
I’ve found the following setup works well. It’s simpler, but less featureful:
Website is a git repo stored on a nas, and backed up. (GitHub would also work; private repos were scarce when I set this up). It’s published with “s3 sync”, and sits behind a cheap cdn.
Desktop is backed up to NAS (via NFS; would use syncthing if I was setting this up again. Previously, I used Unison, which confused some other users of the desktop, but I like it anyway.)
NAS uses synology’s client side encrypted HyperBackup to B2.
Calendar and contacts are on the nas, using baikal, which runs in a docker image on the synology. My phone is fine with periodic access to the contacts and calendar server, so this sits behind the firewall, and is not accessible via the internet.
Total monthly cost is pennies, not counting domain names, or the B2 backup data.
The main problem is that all the data will be compromised if the NAS is stolen. I’m looking for a good solution to that next.
As a middle ground, you can also simply use Hetzner's hosted Nextcloud offering, which is likely (a) more reliable and (b) cheaper than a self-hosted setup on a VPS.
Regarding server hosting: Hetzner has a very attractive server auction on their website [0]. For about 30€ you can get several terabytes with a fast CPU and plenty of RAM. No setup fee either. These are unmanaged dedicated root servers.
Basically, cancelled server subscriptions are offered here again before they take the server apart. So the offerings vary and are time-limited. However, if you pull the trigger there is no limit on how long you can keep using it.
Servers are in Germany or Finland.
I am currently waiting for slightly better offerings (a few weeks ago, when I first found out about it, there were slightly more attractive options) and will then pull the trigger.
I have yet to find anything that comes close to this bang-for-buck ratio.
Seconding this – Hetzner also has pretty decent technical support considering the price.
If you get one of their servers, make sure to check the manufacturer and type of the hard drives and compare them against Backblaze's Hard Drive Stats [1]. Also keep an eye on the S.M.A.R.T. status of your disks. In the past, Hetzner has shipped servers with less-than-ideal hard disks (e.g. the Seagate ST3000DM001). While their staff is pretty fast at replacing disks, those disks do tend to start breaking all around the same time.
Oh, thank you, that is good to know. Sadly, before ordering a server from the SB I cannot see the exact HDD models. It just says whether or not it is the enterprise version, and that is it, e.g.:
That's right, although in my experience the staff at Hetzner were quick to replace faulty disks in the past. You should be fine if you keep an eye on your disks (e.g. SMART status reports via cron or systemd timers) and contact support before a disk fails entirely.
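If you go the daemon route instead of cron, a minimal smartd.conf along these lines keeps an eye on every detected disk (the mail address is a placeholder):

    # /etc/smartd.conf: monitor all SMART attributes on all disks,
    # and mail a warning (repeated daily) while a problem persists
    DEVICESCAN -a -m admin@example.org -M daily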
I think it's great that people are publishing their home server setups.
At the same time, the scary sounding warnings of "You're at risk if you put your trust in another company to hold your data!!" ring really hollow to me. I mean, does this person keep all of his money under his mattress, or does he put it in a bank (though I guess he could keep it all in crypto...)? Does he buy insurance, or again just keep a mountain of backup cash in a safe somewhere?
At the end of the day our entire economy is built around being able to trust other companies, and the systems in place to safeguard that trust. "I'll do it all myself" is essentially the process you see in third world countries where the systems are too fragile or corrupt to support that trust.
The systems that keep track of our money are (usually) very secure [0] and when they aren’t there is recourse to fix the damage. A thief can not use the stolen money if the transaction is reversed, but everyone on the planet can abuse your data once it is leaked.
[0] At least more secure than the systems those same companies use for their consumer data (see: Equifax).
Looking through the comments, I think it would be great if someone could bundle all of this into a product which automatically applies security updates and offers some form of visual dashboard showing the status of the system, errors, and logs of attempts to compromise it. Furthermore, a migration tool from Gmail/Google Apps/Drive would be super useful (plus one for Microsoft's offerings).
I believe many would be willing to pay for such a service, and I would be open to collaborate on building such a product.
I can see here things like:
- resale of hardware components, and support agreements / paid subscriptions for the software
- paid setup support
- initial fee of the product
- small subscription fee for updates
If you really want control, what matters more is having control of your own domain and encrypting what doesn't need to be public, such as backups and notes. Managing a self-hosted system is often more expensive and more time-consuming, and those self-hosted services often store unencrypted versions of your data. Now you have to maintain its security yourself, usually worse than professional services do, and you're still one subpoena or hack away from it being exposed.
In the end you are still just as vulnerable to getting booted off a VPS as you are with Google, but with domain control you can at least switch hosts without losing your address, and you usually have customer support.
I created codehawke.com architecture from scratch to avoid hosting my content on other people's platforms. I make way more money than with platforms like Udemy. I think we should all be moving away from other people's platforms and tools.
I've been self-hosting a few bits and bobs over the years (mainly Gitea, FreshRSS, Pi-hole, Excalidraw and other custom services I've written).
Recently I've put together a little Nomad + Consul Raspberry Pi cluster (3 nodes) to schedule them all as Docker containers, with each thing in its own job file. Traefik handles routing and HTTPS, and integrates nicely with Consul.
The cluster setup is all in Ansible, which took a while to set up and fine-tune, but I think (hope?) it's in a good enough place that I can rebuild the cluster if I ever mess anything up.
Clustering might be overkill but I like being able to deploy things through Nomad and it just working without much fuss.
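For a sense of scale, a job file for something like Gitea stays pretty small; a rough sketch (image, port and datacenter name are placeholders, Traefik/Consul service tags omitted):

    job "gitea" {
      datacenters = ["dc1"]
      group "gitea" {
        network {
          port "http" {
            to = 3000
          }
        }
        task "gitea" {
          driver = "docker"
          config {
            image = "gitea/gitea:latest"
            ports = ["http"]
          }
        }
      }
    }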
I tried to have a setup similar to this during covid but ended up with a bit of a mess.
What I wanted was a home server that used SSH (reverse) port forwarding to forward services to my VPS, which also had some images running in a docker-compose stack that I wanted to have more robust uptime than my home server. I ended up being unable to get Traefik to pick up on the forwarded ports, and ran into SSL certificate issues that seemed insurmountable wrt hosting Jellyfin this way.
Does anyone here use a hybrid home-server / VPS setup like this and know of a better approach? I prefer SSH port forwarding because I move about once a year and don't always have access to router settings.
Yes, I use WireGuard to link a VPS with an array of computers in my home via point-to-point tunnels. This solves the "my home IP is not completely static" problem, because wg handles roaming quite gracefully.
And then I just use either DNAT or an nginx reverse proxy to proxy HTTPS to some HTTP ports at home, depending on the service.
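The nginx side is just a plain reverse proxy on the VPS pointing at the home machine's tunnel address; a sketch with placeholder names, addresses and cert paths:

    server {
        listen 443 ssl;
        server_name files.example.org;
        # cert paths are placeholders (e.g. from certbot)
        ssl_certificate     /etc/letsencrypt/live/files.example.org/fullchain.pem;
        ssl_certificate_key /etc/letsencrypt/live/files.example.org/privkey.pem;
        location / {
            # 192.168.1.2 = the home machine's address on the tunnel subnet
            proxy_pass http://192.168.1.2:8080;
            proxy_set_header Host $host;
            proxy_set_header X-Forwarded-For $remote_addr;
        }
    }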
Interesting! I've never used wireguard before. When you link it with the VPS, is it able to behave as if ports from your home network are running natively on the VPS?
I'll look into using DNAT/nginx, but I really do like having everything in a format where all the configuration is self contained in code and can be spun up / down easily, and I'm not sure if I can accomplish that using those tools
You'll see a wireguard network interface on all the connected devices, and you can configure some private address subnet on it, like 192.168.1.0/24 and give the devices some addresses from this range.
Then you can just talk between any of the devices via this subnet. Wireguard will securely tunnel the traffic.
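Concretely, the VPS end of such a tunnel looks something like this (keys and addresses are placeholders; the home peer mirrors it, with Endpoint pointing at the VPS and PersistentKeepalive set so roaming NAT stays open):

    # /etc/wireguard/wg0.conf on the VPS
    [Interface]
    Address = 192.168.1.1/24
    ListenPort = 51820
    PrivateKey = <vps-private-key>

    [Peer]
    # the home server
    PublicKey = <home-server-public-key>
    AllowedIPs = 192.168.1.2/32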
Completely misleading title and this is basically an ad:
(* Links to Vultr contain my referral code, which means that if you choose to subscribe to Vultr after you clicked on that link, it will earn me a small commission.)
"Host yourself" means running on your own hardware.
I successfully extricated myself from Gmail to ProtonMail, only to be getting dragged back to Office 365 due to ProtonMail not having a working calendar and FastMail not supporting calendar sharing (to non-FastMail users) or delegation.
Yeah, Microsoft won office productivity two decades ago with Exchange and Outlook (and OWA), not just Excel and Word. The integration between email, calendar, contacts, documents, and access control is still unmatched, certainly not by Google's hodgepodge of web apps.
Proton is working on calendaring, but it still has a long way to go.
It's fine for simple use, but without the ability to create calendar invitations it's far from the competition. (To say nothing of sharing e.g. free/busy with my work calendar or delegation.)
For personal use, I agree with the other comments that it seems like a lot of work. But in a corporate setting it could be useful; I wonder if these types of applications (Nextcloud) are how the cloud gets broken up eventually.
I've recently thought this would make a great business model. You set up a service where you deploy open source tools like email, picture storage, etc. to run on AWS Lambdas for people. All they would need to supply is a domain name (via OAuth access to DNS providers) and an AWS account. For a single user, the app's costs would probably be under a dollar a year. They pay you a one-time setup fee, and a maintenance fee only if they want to receive updates. Configure nightly backups for them, etc. I'd definitely pay if this existed already.
I like the article and many of the recommendations (and some others to look up). I do host some of these things but likely never all of them.
The post wasn't entirely clear on whether it was primarily motivated by privacy or by availability. If it's not about strict privacy, it's far easier to use whatever is convenient and still allows you to stream-replicate the data. For Gmail, I send a copy elsewhere so the mail is accessible outside of Gmail. The post itself includes offsite backup, so you could just start there if you consider your primary use site to be the "onsite".
The mobile app is crucial to me, and its search and performance let me down when the car broke down and at the hospital, the times I needed it the most.
I'm currently working on an MVP for a mobile app with a small Python server-side component. In the past I would've spun it up on AWS or GCP, but this time I've decided to challenge myself to see how cheaply I can validate my idea.
After a few hours of work I got it running on an old Raspberry Pi which I then exposed to the internet with some NAT rules and Duck DNS. Not sure how well this approach would work for something more complex but I'm very happy to have put some old hardware (previously in a box and gathering dust) to work again.
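In case it helps anyone trying the same: keeping Duck DNS pointed at a home connection is just a periodic call to their documented update URL (subdomain and token are placeholders):

    # crontab entry: refresh the Duck DNS record every 5 minutes
    */5 * * * * curl -fsS "https://www.duckdns.org/update?domains=myapp&token=YOUR-TOKEN&ip=" >/dev/null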
If there is one thing to take away, it is this: VPSes are cheap, something like $5/month. Really consider having one; you will quickly use it more than you think.
When I was a student I wanted to test things on a remote server, so I started renting a cheap OVH instance with SSH to try some silly ideas and host some static pages. It has been 20 years now, and it hosts (one of) my backups, a professional website, several Docker images, and a gitolite instance, and it has saved me and colleagues numerous hassles when one of us has to share a few dozen GB of data.
I really love the idea of self-hosting, but man, you have to go through nine layers of configuration hell and come back out alive. It's not necessarily fun programming - more a matter of changing variables and running commands, which you might get wrong anyway.
I wonder if there's a viable business model for this.
Automate the setup through scripts and process automation for any provider. You pay a one-time fee plus a reasonable amount for maintenance, with resilience built in. I would pay for it if the price were reasonable.
I've been running Nextcloud myself, and I love it. I've been looking to expand my infrastructure even further -- the Synology NASes are wonderful.
The biggest thing is that I don't think this matters anymore. Google, CloudFlare, and Amazon rule the internet. If they don't want you to be on the internet, it doesn't matter how resilient your infrastructure is. Especially when it comes to critical things, like email.
Is there a word or expression for this idea of not relying on big corporations for one's cyber presence, communications and other such tools?
I thought about info-independence, but I'm sure someone smarter than I already coined something better by now.
I know it is (always?) open source, but not everything open source liberates one from the cloud giants. So there's something there that needs a name, I think.
I'm willing to bet you could run all these services on a single VPS. Having to manage 6 different hosts is going to be a pain in the ass, even if you use something like Ansible.
As far as backups go, I don't understand why the author doesn't just encrypt them and send them to cloud storage; it's what I'm doing personally with restic, and it's not even expensive.
Same here, I host more than that on a single server at home for the family. Encrypted restic backups to Backblaze are so cheap, there's no reason not to do it.
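For reference, the whole restic-to-B2 flow is only a handful of commands (bucket name, paths and credentials are placeholders):

    export B2_ACCOUNT_ID=... B2_ACCOUNT_KEY=...
    restic -r b2:my-bucket:home init              # one-time repository setup
    restic -r b2:my-bucket:home backup /srv/data  # run nightly from cron
    restic -r b2:my-bucket:home forget --keep-daily 7 --keep-weekly 4 --keep-monthly 12 --prune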
> you should consider switching from... Google Maps to OpenStreetMap
I've looked into it, but there is very, very little in OpenStreetMap in my area. And I do not have the time, resources, or expertise to map out my entire area enough to make it useful myself. I would like to contribute to the project, but switching over entirely just isn't an option for me.
I love it. Some of these solutions are things I looked into during the early days of Android, before Google had cemented hegemony on so many things. Namely, Subsonic and K-9 Mail were some early contenders that I remember, although both quite clunky at that point (Subsonic very much had the patina of a one good developer, but no UI specialist, team).
20 years ago I said to myself "screw it", quit a very well paid but nerve-wracking job at a software development company, and never looked back. That's when I also went fully remote (I hire subcontractors but have never felt the need for an office) and started hosting my own stuff on rented dedicated servers and in my own office.
I've had my own domain for something like 22 years now, but it's been a long time since I used it to actually host stuff. Email in particular I gave up over a decade ago and pointed at a hosting provider. I still read that email with mutt over ssh.
I prefer the lower overhead of having things hosted elsewhere, but I keep regular backups and have a well-detailed business continuity plan for each vendor that could go hostile, go belly up, or otherwise become no longer viable.
You take advantage of off-the-shelf options at the same time that you prepare for the worst.
The article appears to be complaining that free services don’t have good support, so the solution is to spend $55. Major providers do offer support plans. If google/Apple/Microsoft is so critical to your life and data, perhaps it’s worth paying more than zero dollars for?
A year ago, I tried to get into it, but :
- My ISP and Pihole didn't have proper IPv6 support.
- Even worse, Pihole requires phoning home to Github for updates... which I wanted to block with Pihole!
So I've shelved this idea for now...
One thing that would interest me: what about ransomware? If everything is connected and synced, how do you prevent everything from getting encrypted before it's too late?
For me, an encrypted FreeNAS with read-only ZFS snapshots has been a good solution to this.
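The snapshot part can be as simple as a cron entry plus a pruning policy on top; a minimal sketch (pool/dataset names are placeholders, and note the escaped % that crontab requires):

    # nightly recursive ZFS snapshots of the data pool (snapshots are read-only by nature)
    0 3 * * * /sbin/zfs snapshot -r tank/data@nightly-$(date +\%F)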
I’ll just mention I have been using vultr for about a year now and I love it.
There are no hidden charges and the service is just amazing. I use Vultr to host my automated HN newsletter that delivers top news headlines straight to my inbox.
Just curious why the OP is using Nextcloud apps instead of the ones that come with Synology? Synology also has alternatives for notes, calendar, photos, etc.
I've got 4 servers and an app that monitors everything.
Daily backups with 30 days of retention. Only had to set it up once.
Weirdly enough, I don't have any maintenance. When I log in to create a new site, I see all the stats too.
It would cost me at least 18€*30 in the cloud (that's the number of sites). I'm 100% sure self-hosting is a lot cheaper for me.
I use Gmail and a Box account too, FYI. But I don't consider that "the cloud"; it's a service that I use, not something to deploy my own development on.
PS: My uptime is better than a lot of the services that are "the cloud".
Also, his website is very slow, probably because he's not using a CDN. A noble goal, but it has an impact on credibility. The slow website makes me feel like he doesn't care about user experience, which makes me assume that is true for his whole setup, and turns me off from even considering it.
That’s a massive jump to make when he is probably being beaten to hell by HN right now.
That’s not a reasonable chain of expectation. Reevaluate your logic.
"HN doesn't make that much traffic"
Cite your sources; that claim sounds made up. Your inference that if somebody doesn't use a CDN or have load times acceptable to you, then they don't care about user experience and their opinion on self-hosting is not worth listening to, is absurd.
They said they got 18,000 hits in one day. That's a tiny amount of traffic for any decent static website (which this one that we are talking about is). Even assuming they got all that traffic in one hour, that's only 5 requests per second.
Poking around the source code, I see the timestamp on the cache is after this comment, so I suspect the issue is more likely that the cache configuration wasn't prepared for the front page of HN. Anecdotally, I'm not experiencing any performance issues anymore.
I'm assuming that if he doesn't care about his website being performant for others, which is its main purpose, then he doesn't care about his other apps doing their main job well either.
At least he has a fighting chance to talk to a human at a local company. You won't get that from Google. Even if you pay for Google One you can still get blown off with "I've already given you all the information I have" (nothing) because telling you why you actually got locked out opens them up to a discussion they are not interested in having.
The most valuable thing for me is my photo library. All of them are currently in Google Photos. Is there any easy way to backup just that? I don’t care about my personal email, tasks, calendar etc. It’s just the thought of losing my photos scares me.
I mentioned it in a different comment, but take a look at Syncthing. It does mesh-style backup to synchronize a folder between multiple machines. That provides robustness against hard drive or PC failures, and it's easy to add an offsite node for extra confidence.
https://takeout.google.com/ is exactly what you want. Deselect all, check "Google Photos", click Next, choose your archive format, confirm. It'll take some time, but at the end of the process you get a nice zip with all your photos in original quality.
I found a solution: use Photos (iCloud Photos) along with Google Photos, since I already use an iPhone. Thanks for the comments, but I believe this is the easiest.
In addition, traditional non-tech companies screw people over on a regular basis. I know I am resorting to whataboutism, but let us not panic and try to build our own cloud. One also has to consider what happens when one gets decapitated in an autonomous-driving accident and the family is left with a home-made cloud.
There's nothing exaggerated about that story. Apple cut off access to his accounts and services because of a payment issue (which wasn't even his fault).
Even if it was his fault, and he just decided not to make that payment, the story still illustrates how much power Apple has.
It's like if you missed a car payment, and the bank used their connections to cut off your cell phone, water, electricity, etc until you paid. Whether or not missing a payment is immoral/wrong, giving a private company so much power over an individual's life is absurd. That's some mafia, break your knee caps with a baseball bat-type shit.