The problem I have always had when building elaborate home server setups is that the "set it and forget it" nature of the systems I've installed bites me in the ass. Since it's not my full-time job to manage these systems, I'm really not familiar with them the way I might be with the systems I manage at work. These systems cruise along for years, and when something finally does go belly-up, I can't remember how I set it up in the first place. Now I have a giant chore looming over me, ruining a perfectly good weekend.
These days, I design everything for home with extreme simplicity coupled with detailed documentation on how I set things up.
Docker has helped tremendously, since you can essentially use an out-of-the-box Linux distro with docker installed, and you don't really have to install anything else on the hardware. Then if at all possible, I use standard docker images provided by the software developer with no modifications (maybe some small tweaks in a docker-compose file to map to local resources).
Anyway, my advice is to keep the number of customizations to a bare minimum, minimize the number of moving parts in your home solutions, document everything you do (starting with installing the OS all the way through configuring your applications), capture as much of the configuration as you can in declarative formats (like docker compose files), back up all your data, and just as importantly, back up every single configuration file.
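To make that concrete, a minimal docker-compose.yml in that spirit might look something like the sketch below. The service, image tag, port, and host paths are just placeholders; the point is that the whole "install" is one declarative file plus a data directory you back up.

    # docker-compose.yml -- hypothetical example: stock upstream image, data on the host
    version: "3"
    services:
      nextcloud:
        image: nextcloud:latest          # unmodified image from the developer
        restart: unless-stopped
        ports:
          - "8080:80"                    # the only "customization": map to a local port
        volumes:
          - /srv/nextcloud/data:/var/www/html   # data lives on the host, easy to back up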
The author focuses the entire blog post on remote third party services that are alternatives to popular third party services financed by data collection as a "business model". IMO, the single most important component of a home network is not any piece of the hardware/software outside the home that the third parties may control, it is the internet gateway in the home. Routers were the most important computers at the dawn of the internet, and IMO they still are the most important computers today. If the internet gateway in the home is ignored as a point of control,^1 then IMO all bets are off.
A significant amount of data collection by third parties can be eliminated or reduced by retaining control over the internet gateway. Arguably this amount is even greater than what can be affected by simply switching to carefully selected alternative third parties. IMO, it is a mistake to believe that one can reliably eliminate/reduce data collection simply by choosing the "right" third parties. Whack-A-Mole, cat-and-mouse, whatever term we use, this is a game the user cannot win. Third parties providing "services" over the internet are outside the user's control. For worse, not better, they are subject to market forces that drive them to collect as much user data as they can get away with.
Regardless of these privacy-destructive market forces, it is still possible to build decent routers from BSD project source code and inexpensive hardware. IMO, this is time well spent.
Most of the data passing through it is encrypted: HTTPS, SSH.
Cutting off phone-home requests is best done on the respective devices: you can run firewalls on most desktops and laptops, and even phones. Phones often go online via GSM or LTE without passing through the home router.
While a DNS filter like Pi-hole can be helpful sometimes, cutting off tracking and ads is best done by browser extensions and by using open-source clients, where available.
The best thing the home router can do is not be vulnerable to exploits, stay up to date, and be fully under the owner's control. That's why my home router runs OpenWrt.
"... cutting off tracking and ads is best done by browser extensions ..."
What if the browser vendor, who is also a data collector, requires the user to log in or otherwise identify herself before she can use extensions?
A home "gateway" is a computer running a kernel with IP forwarding enabled that is being used as the point of egress from the home network to the internet. That is a broad definition and allows for much creativity. That is what I mean by the term "gateway". As such, a gateway can, both in theory and in practice, do anything/nothing that "desktops and laptops, and even phones" can do. Relying solely on pre-configured "limited/special purpose" OS projects as a replacement for DIY and creativity in setting up a gateway was not what I had in mind, but is certainly an option amongst many others.
Don't use such a browser then! Firefox is pretty good.
An on-device firewall can firewall individual processes and applications. An upstream / gateway firewall does not have such fine-grained control. That was my point.
Running stuff on my router is entirely possible, but I limit it to routing and running a wireguard endpoint. I prefer to run my private stuff in the confines of the home LAN.
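For anyone curious, the WireGuard endpoint part is pleasantly small; a hypothetical /etc/wireguard/wg0.conf on the router could look roughly like this (addresses, port, and keys are placeholders):

    [Interface]
    Address = 10.0.0.1/24              # VPN subnet of your choosing
    ListenPort = 51820
    PrivateKey = <router-private-key>

    [Peer]                             # one [Peer] block per roaming device
    PublicKey = <laptop-public-key>
    AllowedIPs = 10.0.0.2/32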
I use a text-only browser for reading HTML. Many times I don't even use a browser for making HTTP requests.
Truly one can use all these strategies, application-based, router-based, gateway-based, if they are available. They are not mutually exclusive. Personally I just would not feel like I can rely on extensions or other solutions tied to some software I do not compile myself. (I do edit and compile the text-only browser.)
All due respect to Firefox, but I have found compiling it is way too time and resource-intensive. It is way beyond what I need for recreational web use. Firefox users seem to rely on Mozilla to do the right things on their behalf. That is not the sort of "control by the user" I am after.
What is behind Mozilla? Online advertising money. Cannot really count on them to do what I want.
For me it has not. But it all depends on what sites one is interested in. Generally I do not tend to find much value in SPAs when I come across them. If there is some data accessed through the page that I really am interested in, I just find the endpoint and bypass the Javascript. Most sites posted to HN are not SPAs, do not require Javascript to read, and work perfectly for me. I can glean the information no problem, and fast. However I am interested in reading text, not viewing graphics. Not every user has the same sensibilities.
I have always aimed to try them all and I will always try any new one I become aware of. I used Lynx from 1993 through the early 2000s, but since then I prefer Links from Charles University in Prague, with no graphics. It has the best rendering of HTML tables of any text-only browser, IMO. It has been the most stable for me at run-time, and it is the source code I am most comfortable with; it compiles quickly. There is a recent fork of Elinks 4.x someone has started that I have been watching. (Elinks is a fork of Links.) Not sure if it has been posted to HN yet. Currently it crashes too easily, but some of the features he has added are good ideas, IMO.
Okay. How can we fix this? I'm dealing with it right now and this space is so hard -- likely somewhat deliberately so. I'm a 20+ year Linux user trying to get a single home network with multiple ISPs going and it just seems way harder than it ought to be; i.e. -- not that every bit of software needs to be idiot-proof, but this iptables/pfSense/netplan etc etc universe just feels downright hostile to the aspiring home user.
Multi-wan is easier with appliances. I used pfSense over the last 12 years or so with multi-wan on and off (currently off). I've run pfSense in a kvm VM, and you can do multi-wan with this. Though I generally recommend dedicated NICs for the WANs and LAN.
I've looked at the Linux-based appliances (as recently as last week) and only ClearOS and OpenWrt supported multi-WAN. I could be wrong (I'd like to be, as pfSense/OPNsense are FreeBSD-based, and that sadly comes with huge amounts of baggage, limited hardware support, etc.). I'll likely be looking at ClearOS as a potential replacement for the pfSense system, though if it can't handle what I need, OPNsense is like pfSense but with far less baggage.
If you don't mind tinkering, you might be able to use mwan3[1].
If you prefer OpenWRT, you can look at running it in a VM[2] along with mwan3.
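To give a rough idea, a stripped-down /etc/config/mwan3 for two WANs looks something like this sketch; the interface names, tracking IPs, and weights are made up, and the option names should be double-checked against the current mwan3 docs:

    # /etc/config/mwan3 -- rough sketch, two WANs load-balanced
    config interface 'wan'
            option enabled '1'
            list track_ip '1.1.1.1'

    config interface 'wanb'
            option enabled '1'
            list track_ip '8.8.8.8'

    config member 'wan_m1'
            option interface 'wan'
            option metric '1'
            option weight '2'

    config member 'wanb_m1'
            option interface 'wanb'
            option metric '1'
            option weight '1'

    config policy 'balanced'
            list use_member 'wan_m1'
            list use_member 'wanb_m1'

    config rule 'default_rule'
            option dest_ip '0.0.0.0/0'
            option use_policy 'balanced'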
I set up multi-WAN in a few small business contexts (on a Zyxel appliance, and on pfSense) and to be honest it was always a little flakier than I hoped. It was always failing over too quickly, or too slowly, or not reverting back to the primary WAN quickly enough when the failure/congestion cleared up. I think any sort of multi-WAN setup where you aren't actually an AS running BGP is doomed to be somewhat hacky. Of course running an AS is both a ton of work, and basically impossible for a residential customer anyway.
It is uniquely suited to running services without interruption: file sharing, messengers, torrents, an HTTP server, etc. Smartphones and laptops are mobile devices with spotty connections, powered by batteries.
Gateway, not router. I personally do very few and simple things on the gateway. Way less than what an average default-configured router running OpenWRT does.
No worries. Usually they are the same. The terms are probably interchangeable. The point I was trying to make is that the gateway does not necessarily have to be a pre-packaged router with a difficult-to-manage, complex configuration. It can be the computer over which the user feels she has the most control.
No, I'm in the process of moving to OPNsense. Unlike pfsense it's actually open source, not controlled by a company that tries to drive competition out with dirty tricks and they aren't trying to move their customers to a DRM solution.
How hard has it been to move to OPNsense? I've been dragging my feet on doing it because my pfSense router is set up just where it needs to be, but I really do want to move.
Yep. This will cruise along longer than the parent's solution, but when it breaks, you'll be rebuilding all of the original services from scratch, plus the management system you built to manage them.
But it only breaks when all systems fail together; if your router fails, you can rebuild it from the gitlab job. If the VM host fails, you have time to replace it because the rest of your network still functions. If your git host fails, same thing, but where did you put those backups?
My solution is Kubernetes. Everything's configured in YAML files. The solution to all those problems is... change fields in YAML files.
Of course, you need to figure out what you need to change and why, but you'll never not need to do this, if you're rolling your own infra. K8s allows you to roll a lot more of the contextual stuff into the system.
Do you find there to be a good amount of overhead in running your own Kubernetes cluster? I'd think initial setup would be a bit of work, and then keeping the cluster updated and patched would be a good amount of work as well.
Then you've just traded maintaining one system for maintaining another.
Just started this journey myself and while there’s tons to learn, getting something up and running using k3os and flux cd takes no time at all and gets you a working cluster synced to your repo. K3s is pretty light, I know some people running it on pis.
If you use hosted Kubernetes (GKE, EKS, etc) then you don't need to deal with any of that, which is nice. You get the flexibility of Kubernetes without needing to care about any of the underlying infra
Once you learn it it's pretty straightforward. K8s has a very simple underlying architecture. It's intimidating at first, but yields to study and care.
I have also been using Kubernetes for this for years now. My favorite property is that it will run forever, no matter what happens.
The annoying part is that when I do want to do updates (i.e. updating cert-manager from 0.1.x to 1.0.x, etc) it can be a pain. So I save these large updates for once a year or so.
The solution is keeping a local mirror of all images and artifacts, and version pinning for stability (along with a periodic revision of version numbers to the latest stable version).
Oh and don't forget that now maybe you make everything work, but in two years time your setup won't be reproducible, because chances are the original images are not available any more, they got deleted from Docker Hub some months after you used them. Yeah, you should update them anyway for security... but the setup itself is not reproducible, and being forced to use the latest version of something, with the new idiosyncrasies it might bring, is not a nice situation to be in when you just want to hurry up and resolve your downtime.
So I guess that's one more thing to worry about, it seems: maintaining your own image repository!
Maybe, but when the original docker image is no longer available on docker hub, chances are there will be something better and even easier to setup. And with docker you don't care about installing / uninstalling apps and figuring out where that obscure setting was hidden - all you need is just a stock distro and a bunch of docker-compose.yml files, plus some mounted directories with the actual data.
But a lot of those unofficial docker images are of unknown quality and could easily contain trojans. It's completely different from installing a package from your distro.
On the plus side, the Dockerfile and the repo with the scripts used to build a container is usually available. If you don't trust it, read through the source and rebuild it. Or just stick to official containers, no matter how terrible they are.
Even if so you're still spending say 50% of the original time investment every year or so just maintaining it. Unfortunately your options seem to be "set up once then never touch it again" or "update everything regularly and be at the mercy of everything changing and breaking at random times".
I mean, you should always have a backup of your dependencies (within reason).
I develop mobile applications, and use Sonatype's Nexus repository manager as my primary dependency resolver. Every time I fetch a new dependency it gets cached.
A monthly script then takes care of clearing out any cached dependencies which are not listed in any tagged version of my applications.
Agree that documentation is key here. Anything you do that is beyond the vanilla "pave the install and plug it in" should be written down.
It doesn't need to be perfect - I have a onenote notebook that has the customizations I've done to my router (static IP leases and edits to /etc/config/network), and some helper docs for a local Zabbix install in docker that I have. I recently had to figure out how to migrate a database from one docker image to another, and there is no way I would remember how to do that next time, so I wrote down everything I learned.
Just a simple copy/paste and some explanatory text is usually good enough. Anything more complex (e.g., mirroring config files in github) still (IMO) needs enough bootstrap documentation because unless you're working with it daily you're going to forget how your stuff works.
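As a rough illustration of the "mirror the important files" half of that, something like this is all I mean (the paths are just examples of the kind of thing I touch); the explanatory notes still have to live next to it:

    #!/bin/sh
    # snapshot the handful of files I actually edited, next to the notes explaining why
    REPO=~/homelab-docs
    cp /etc/config/network         "$REPO/router/"
    cp ~/zabbix/docker-compose.yml "$REPO/zabbix/"
    cd "$REPO" && git add -A && git commit -m "config snapshot $(date +%F)"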
Additionally a part of my brain is worried that if I get hit by a bus my wife/kids will have a hell of a time figuring out what I did to the network. Onenote won't help them there but I haven't figured out the best way of dealing with this.
(I recognize the irony in a "I'll host it myself" post in storing stuff in onedrive with onenote but oh well)
Just to throw more products at the wall, I've been using Bookstack[0] for the same sort of documentation.
Besides being relatively lightweight and simple to set up, the out-of-the-box draw.io integration is nice. Makes diagramming networks and other things dead simple. And I know "dead simple" means I'm infinitely more likely to actually do it.
I also started doing something similar via org-files, git, emacs, and Working Copy. It has worked pretty well, though Working Copy (the iOS git client) was buggier than I expected (but they have a great developer and support). My network isn't very good, or I'd just use emacs on iOS via SSH via Blink.
I work on trying to script each install. So if I need to repave, I have a documented, working script, and the source bits to work with.
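A sketch of what one of those install scripts tends to look like (package names and paths are placeholders); the real value is that the script doubles as the documentation:

    #!/bin/sh
    # repave.sh -- hypothetical: fresh OS to working appliance in one pass
    set -eu
    apt-get update
    apt-get install -y --no-install-recommends nginx borgbackup   # example package set
    cp ./configs/nginx.conf /etc/nginx/nginx.conf                 # configs tracked alongside the script
    systemctl enable --now nginx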
I've preferred VMs for functional appliances for a while now. I like the isolation compared to containers. Though YMMV.
Right now, the hardest migration I have is my mail system, which makes use of a fairly powerful pipeline of filters in various postfix-connected services. It's not fragile, but it is hard to debug.
I host it myself; as the core thesis of the article pointed out, you can be deplatformed, for any reason, with no recourse. And if you lose your mail, you are probably in a world of hurt.
The one thing I am concerned about is long term backup. I need a cold storage capability of a few 10s of TB, that won't blow up my costs too badly. Likely the best route will be a pair of servers at different DCs, running minio or similar behind a VPN that I can rsync to every now and then. Or same servers with zfs and zfs send/recv.
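The zfs send/recv route over the VPN can stay pretty simple, something like this sketch (pool, dataset, and host names are made up):

    # initial full replication of a snapshot to the offsite machine
    zfs snapshot tank/archive@2021-05-01
    zfs send tank/archive@2021-05-01 | ssh offsite zfs receive backup/archive

    # afterwards, send only the delta since the last snapshot
    zfs snapshot tank/archive@2021-06-01
    zfs send -i tank/archive@2021-05-01 tank/archive@2021-06-01 | ssh offsite zfs receive backup/archive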
Thinking about this, but still not sure what to do.
I've got the same problem, I have an ubuntu fileserver I set up a few years ago, but have none of the monitoring/alerting that I'd have if I set it up for my day job. And really, I don't want to do my day job at home.
So it's a bit of a catch-22, I want a secure and stable home system, I don't want to spend much time working on it, but I want full flexibility to install and run what I want, and don't want to trust some off the shelf consumer solution that's likely going to be out of support in a couple years.
I just use a docker-compose stack (one yaml file next to a bunch of subdirectories) templated out in Ansible.
Ansible will be around for a while, but even if it's not its (yaml) syntax is incredibly easy to read. Any successor in that area is somewhat likely to have compatibility or at least a migration path.
This together reaps the benefits of Docker (enhanced through Compose), and Ansible is documentation in itself. There's barely any actual comments. The code speaks for itself. Also, I can reproduce my stack with incredible ease.
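Roughly what that looks like in practice, as a sketch (role layout, paths, and file names are whatever you prefer; this assumes plain docker-compose on the target):

    # roles/homelab/tasks/main.yml -- minimal sketch
    - name: Render the compose file from a template
      template:
        src: docker-compose.yml.j2
        dest: /opt/stack/docker-compose.yml

    - name: Bring the stack up (not idempotent as written, just an illustration)
      command: docker-compose up -d
      args:
        chdir: /opt/stack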
This right here. I recently lost my home server of 10 years courtesy of the Texas power issues during the winter storm. I rebuilt and started with a fresh Linux install. Having a recent backup of /etc made it so much easier than it could have been. I had more trouble with the network driver on the new motherboard than with all my services, customizations and data.
Yikes, did you have it on UPS? How did a power issue kill your server? A big spike? Most filesystems like XFS should be able to fsck even after a power drop, especially with RAID. (I prefer RAID-10 for speed of rebuilds.)
Surge protector, but the spikes were huge. My drives were actually fine; what got killed was the PSU, and it was a case-integrated PSU in a system I had been meaning to upgrade for a while anyway. So this gave me the kick in the butt to finally do it. I also went to NVMe for the root drive; wow, what a difference.
The system that died was not out of date software-wise; it had been dutifully upgraded with every major Debian release. So starting on a fresh install, adding packages, copying conf files from my backups, and adding my data drives really wasn't bad. A couple of days after work to do the hardware build and restore.
Because of shortages / cryptomining / covid I couldn't get the HW I really wanted, that was frustrating.
Eventually, whatever platform / tool you use will need to be upgraded. Security vulnerabilities, new features, etc happen and projects like these can get abandoned within a 5-year timeframe. When you have to migrate to either a new or upgraded platform, you have to figure it all out yourself. When the config is broken by an upstream dependency, you’re on the hook too. Who knows if the build tools you used still work on current versions of things.
Like it or not, we’re all kind of stuck on these platforms we don’t control. The alternative is to become fluent in yet another technical stack, but one that will be used infrequently and won’t really translate to anything else unless you’re trying to build your own cloud service on consumer-grade hardware.
Today I heard fellow SREs discussing whether RRDtool is the best solution for monitoring private things. Its only merit: it stopped evolving. That might or might not outweigh decades of progress.
RRDTool is great for exactly this reason! I actually still use it to plot time series data on a raspberry pi for various projects. It hasn’t changed in at least a decade and it will run with good performance on any hardware. If it ain’t broke, don’t fix it
Long term platform stability seems to be moving from relying on the OS to higher up in the stack with the advent of Docker. It feels like Docker is being used more and more in cases where I would have considered something like CentOS.
Docker feels like the equivalent of the teenager that doesn't want to clean his room so he just pushes all his mess under the bed.
The complexity is still under the bed. We're all going to have to dig under that bed one day. Or we're just going to end up buying new hockey sticks, football pads, etc. Which is to say, we're going to end up with Linux on top of Docker on top of Linux.
I'm not sure I follow the analogy. I'm guessing the mess represents a messy filesystem/environment. What does the bed represent? A Dockerfile? Dockerfiles are way more transparent than, say, a random server's / directory.
That's just my opinion of course. It's possible to write confusing Dockerfiles but really they're mostly just shell scripts. And the idea of "Linux on top of Docker" seems a bit odd - there's only ever one Linux kernel no matter how many containers you have running. Docker is built on Linux.
It's closer to owning a car factory making disposable cars and getting a new one when it gets dirty. At some point your supplier shuts down and you cannot make a specific part anymore, which prevents you from building another car. You are now stuck with the dirty car.
If your application doesn't rely on hardware details (such as GPU acceleration or networking) that bypasses well-defined OS APIs, it's actually a very good approach.
But when you do Kubernetes or Docker, there’s almost always another stack beneath you that you have to build and maintain. Whether that’s bare metal (yikes), VMWare ($$$) or AWS/Azure (not self-hosting) you still have to deal with upgrades / API changes / hardware refresh there.
Container solutions really only get you out of “doing the work” if you can leverage a prepackaged container management solution from a cloud provider. Self-hosted containers are frequently more trouble than they’re worth since the solutions for managing them are either insanely complex (Kubernetes) or so simplistic you have to write some custom logic to build/deploy them.
Agree with the point about CentOS though; at this point the idea of a Linux "distro" is dead. The way forward is a hypervisor model where the kernel is protected from application code (including all dependencies from the userland), with barebones Docker images like Alpine used as the basis for a declaratively-defined system. The one thing Docker does very well is isolate dependencies, which reduces integration complexity and lets you modularise the whole thing. Incidentally this actually makes it harder for hobbyists to get into, as the mental model has a lot more complexity and you need more expert knowledge to write the scripts to build and configure your app automatically.
Seems like there's a probably reasonable trend of piling some other tool on top of the stack because dealing with underlying layers is hard. Like electron apps and docker images. Or just web browsers.
Kind of worrisome to abandon lower layers with their problems and build on top of them, but what can you do, but get good at jenga.
I feel the same, however I force myself to keep doing that. Without docker.
Yes, every time a new debian release is out something I fiddled with will break and I have to remember how it works, but I see it as some sort of training. As a dev I don't want to be completely clueless about how things I use every day work, so while changes that break old configs and workflows are annoying I'm forced to learn what changed in the gnu/linux/debian world and maybe even find out why.
Also I get better at documenting things. Years ago I didn't even know what exactly to document since the moment you do something everything just naturally comes together, but after a couple times you kinda get a feeling for what the important bits will be two years down the line.
So about once a year I reserve a perfectly good weekend to upgrade, restructure or otherwise maintain my little home server running debian with things like mariadb, nginx, a filtering bridge for my lan, dnsmasq with a block list, borgbackup, syncthing, cups for printing and a couple other things I don't remember right now.
I encounter this every 6-12 months when I go back to an old project that is 'working' and want to add something/update something and it all just looks foreign to me.
The worst thing is that I have often gone through a lot of effort around making it easy to set up and deploy (docker and whatnot) but even that I have forgotten about. (I came across a docker file in an old project and couldn't get it to work properly until I noticed that there was a docker compose file lying around that I had missed)
How do you keep track of documentation? I guess for a project a README in the git's root is a good start, but what about more complex systems stuff that does not live in a git project? For example, I had to manually edit a bunch of config files on my Proxmox setup to get docker and some other things to work properly. Where would I document such manual steps? I am thinking a text file somewhere in cloud storage but then of course I'd need to remember that...
My doc is split between Notion (for bigger, structured projects) and a bunch of local md files (for general, greppable knowledge).
For my VPS, I have a Notion page where each project (name, url, mapped ports) is a row in a table. Then the project page contains a copy of my docker config and various information I might need for maintenance/reinstallation.
I don’t try to put the documentation next to the thing being described because then I’ll lose track of it. I set up a simple Gollum wiki and put everything there. That way I get md for useful formatting, version control on the docs without having to create a zillion git repos, and I never have to wonder where the docs are.
Nix helps with this. Or at least it intends to. I still need to track down what's vulnerable and what isn't, but most of my setup is reproducible thanks to Nix.
I've read people admitting that Nix can give you a lot of work from time to time, when something isn't already available for Nix (which happens more often than, say, for Debian), and you want to add it to your system. Is that true in your experience? I may be phrasing it wrong.
I found the nix learning curve to be steep but short. The first time you need to make something available for nix (particularly a service), you'll probably copy something someone else wrote, make a few changes and then (if it didn't work) stare at your screen for a while wondering why it didn't work[1].
It is very different which can be very off-putting, but is not usually gratuitously different, and once you get used to it, it's pretty straightforward.
After having done it a few times, I find that I can adapt a random project not already in Nixpkgs for nix in under an hour, and it's something I do maybe twice a year or so.
One counterintuitive advantage I found switching to nix from other systems is that since the 4 step "download, configure, make, make install" usually doesn't work, I take the time to make a nix expression. On Gentoo and Arch, I would often just install to /usr/local from source and then forget what I had installed and not know how to upgrade it. If you have more discipline than I do, then it's a bug not a feature, but for me it's super helpful.
1: If the project uses cmake or autotools and has no strange dependencies, then packaging it for nix is trivial. However a surprising number of packages do things like downloading dependencies from the internet at build time, and it's not always immediately obvious how to adapt that to nix. Projects using npm or pip also probably won't work right away just because the long-tail of dependencies means that there will be at least one dependency that isn't already in nixpkgs (haskell should in theory be just as bad, but the strange proclivity for haskellers to use nix means that someone has probably already done the work for you).
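For the trivial cmake/autotools case, the whole expression can be as small as this sketch (project name, source, and hash are hypothetical; lib.fakeSha256 just gets you an error containing the real hash on the first build attempt):

    # default.nix -- minimal sketch for a hypothetical cmake-based project
    { pkgs ? import <nixpkgs> {} }:

    pkgs.stdenv.mkDerivation {
      pname = "some-tool";
      version = "1.2.3";
      src = pkgs.fetchFromGitHub {
        owner = "example";
        repo = "some-tool";
        rev = "v1.2.3";
        sha256 = pkgs.lib.fakeSha256;   # replace with the real hash reported by the first build
      };
      nativeBuildInputs = [ pkgs.cmake ];   # the cmake hook handles configure/build/install
    }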
Just firewall off all management interfaces and allow access by IP as needed. It's still possible your webserver will become vulnerable, but you'll probably hear about it here if it is.
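For example, with something like ufw that can be as short as this (the trusted address is just a placeholder):

    ufw default deny incoming
    ufw allow 443/tcp                           # the public web server stays reachable
    ufw allow from 203.0.113.7 to any port 22   # management (SSH) only from a known IP
    ufw enable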
I almost guarantee that when you need to make a change to your home server in 10 years, docker compose will no longer be able to pull any images from docker hub without being upgraded, and upon upgrading you'll find all your config is now invalid and no longer supported...
> when something finally does go belly-up, I can't remember how I set it up in the first place
Why should you need to remember it? Like you wrote later on, you just "document everything you do", as you do it. That's better than any sort of script or version control, since you can describe it how you like, which means quite succinctly. And it's not too difficult to adopt the mindset of "What would I have needed to know 10 minutes ago to understand what to do".
> Docker has helped tremendously, since you can essentially use an out-of-the-box Linux distro with docker installed, and you don't really have to install anything else on the hardware
This actually illustrates, IMHO, how containers and docker are overused. You're talking about a single machine with a single purpose/set-of-purposes, and no occasional switching of configurations. So - why containerize? Whatever you have on Docker, just have that on your actual system.
> my advice is to keep the number of customizations to a bare minimum
Sound advice based on my own (limited) experience with self-hosted home-servers.
> and just as importantly, back up every single configuration file.
Fine, but don't rely on this too much. It's always a pain, if at all possible, to restore stuff based on the config files. Usually easier to follow your self-instructions for configuration.
I think it's important to document anything, be it at home or for business. I've done good and not so good jobs with this at times, and you feel like a hero when it's well done and you suffer through reinventing things if it's poorly done.
I agree you should have config files and backups, but like with work, I think it would be good to go through a "disaster" where you have to build a version of your config from an unconfigured environment.
A colleague had built such a config for a small office (10 people), but it depended on a specific DNS server that ... wasn't set up until halfway through the config. It worked because in testing they were using the network with a DNS server that was already working. Small things like that are hard to catch in a working environment.
What I sometimes do is make a bash script that does all the setup and save it in the home folder. You can just copy/paste each line from the script to set things up again, and you'll know exactly what you did to the system later on.
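Even something this small does the job (package names and paths are just examples); the script itself becomes the record of what was done:

    #!/bin/bash
    # setup.sh -- lives in the home folder; doubles as documentation
    set -x                                        # echo each command, so re-reading it later is easy
    sudo apt-get install -y docker.io docker-compose
    sudo usermod -aG docker "$USER"
    mkdir -p ~/services/pihole
    cp ./docker-compose.yml ~/services/pihole/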
> These days, I design everything for home with extreme simplicity coupled with detailed documentation on how I set things up.
Definitely a good strategy, yeah. And on the documentation front, that's exactly what I've started doing differently, too; after having to recreate my mail server one too many times, I eventually decided "you know what, I should probably write down what I'm doing", and now if things go belly-up again I at least have that starting point.
As a bonus, you can also put it online as a tutorial, which is exactly what I did: https://mail.yellowapple.us :)
I have been self-hosting since forever, or rather since ADSL became available. The server has changed "a bit" during the last 20 years (oh, Mercury mail server :D), but for the last "few" years (since FreeBSD 8 :D) I have been a happy camper there. Just using it casually and learning a thing or two along the way. Upgrading it, migrating (2 times so far), adding disks. Nothing special.
Just got into a situation where the mobo started failing, and I said it was time to reinstall, bought new hardware and threw it together. In a week I migrated everything that had been customized over ~20 years without taking notes (but with my good ol' trusty diffing software) and carried over from the previous server, optimized a bit, removed unnecessary settings, upgraded postfix and dovecot on the new server, replaced SpamAssassin with rspamd, php-fpm,...
The server was mostly operational in 2 days. Everything else was studying new software, doing things I had always wanted to do but didn't want to upend the whole configuration for (like stuffing everything into jails), customizing netdata, upgrading the database/nextcloud/... etc. It would have taken longer if I had lost the data, but I trust zraid and my LTO drive; they have never failed me.
Now I will sniff around it every week, maybe run some updates, etc., and I will be fine until some disk fails.
Being a system administrator is not my occupation, but it helps if you NEVER EVER become a hostage to a cloud provider. The more you lean into the comfort of someone else doing everything for you, the fewer chances you will have to learn new things and the more you will be at the mercy of someone else. And the longer you enjoy such a situation, the more technology progresses and the bigger the gap grows between the knowledge you have and the knowledge needed to, in this case, set up a server.
I remember dreaming in 1991, as a kid, that in 20 years everyone would have their own server at home, and how we would transfer files simply. But now everyone says they don't have time and buys ready-made boxes or pays for the cloud.
The technology came, now it is simpler to use than ever.
But no one has "time" for that now. Or is really the time?
---
Don't take having a home server as a pain. It is a great way to learn things.
I can bake perfect bread too. "Kicked" all my girlfriends out of the kitchen. Brewing beer at home. ...
But none of this would have happened if I had just gone to the bakery. Eaten in restaurants. Bought beer in a store. ... Sure I could have, but I didn't.
You don’t even need Docker. Another great technology to learn is LXC and driving it from the CLI / a script.
All my hosts run a default OS with a 20 line firewall and a bridge.
The top level host has a zpool backed by a blank file in /tank.device.
The actual work is done by a bunch of LXC hosts all cloned from a standard base installation. Anything persistent goes in a per-container zfs filesystem mounted in each container's root. The only thing I ever back up is the /tank.device file.
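A sketch of that workflow with the plain lxc-* tools (container and dataset names are made up; mounting the per-container filesystem into the container's root is a one-line lxc.mount.entry on top of this):

    # file-backed zpool plus a per-container dataset
    truncate -s 100G /tank.device
    zpool create tank /tank.device
    zfs create tank/web

    # one standard base container, cloned per service
    lxc-create -n base -t download -- -d debian -r bullseye -a amd64
    lxc-copy -n base -N web
    lxc-start -n web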
Wiping everything and building from scratch is a pleasure.
I personally found Docker and Compose pretty simple to learn, and after years working with them I'm pretty well acquainted, but am interested to learn if there are any tangible benefits of using LXC instead of Docker?
Docker has nice image caching and fetching. It’s all about deploying and containerising in one step, usually a one line shell script alongside the container config.
LXC is more like deploying disposable OS instances and then having to deploy your app yourself, separately. It’s more bare metal but has fewer surprises in terms of supporting IPv6 and not having any inscrutable iptables magic happening on the host OS.
Docker is IKEA. LXC is hiring a joiner. (Not a value judgement.)
If you want stability I would go with a custom OS, e.g. no auto updates, only update parts of the system if you need/want to. And apply security/hardening in several layers. Make backups, of course. Then you could leave it running, with really nothing to worry about except hardware failure. The problem is that when the hardware does fail, it might be difficult to get a CPU with the same architecture (like 50 years from now).
I guess this goes back to if you are DIY person or not.
For me I treat that just like I do any service in my home. I am the type that will tear apart my dryer to fix it vs. buying new or bringing in a repair person.
I repair my own car, appliances, etc. For me the Home server is the same.
I have been through this way too many times. Every method I found was either as bad or added more problems (like docker, configs in git, etc.). Now I have removed almost all customisation/configuration from the services I need and exchanged them for what are basically default setups of containers in Proxmox. Everything is backed up, and the amount of configuration needed if something did somehow go away is measured in minutes instead of hours. Luckily I don't need remote access, which makes this easy.
The diagram alone is more than enough of an argument to dissuade me from giving this a shot right now - it's simply too complicated and too much to manage for the amount of time I can dedicate to it.
BUT - I'm really thankful for people who keep posting and sharing these sorts of projects; they're the ones iterating the process for the rest of us who need something a bit more turn-key.
I'm excited to see this eventually result in something like the following:
- Standard / Easy to update containerized setup.
- Out of the box multi-location syncs (e.g. home, VPS, etc.)
- Takes 5 minutes to configure/add new locations
I want this to be as easy as adding a new AP to my mesh wifi system at home: plug it in, open the app, name the AP, and click "Done".
I think the trick is to do a little at a time and keep at it. Over time it adds up.
At some point you will hit something interesting: Personal Sovereignty.
I've seen other folks hit this in weird ways.
My friend started working on cars with his buddy. They finally got to an old vehicle they took all the way apart and put back together. He had gotten to the point where he could pull the engine and put it on a stand, weld things, paint, redo the wiring harness.
I remember one day I went and looked at it and he sort of casually said, "I can do anything".
Anyway, I think the diagram says something else to me. It says he understands what his setup does enough to show it/explain it to someone else.
I had this with my bicycle at some point -- learning to fix and tweak oneself without having to go to a mechanic was eye-opening. Reminds me of the core premises in Zen and the Art of Motorcycle Maintenance.
I think the diagram gives a skewed view of how hard this actually is.
I run a very similar setup, except my VPS is only a proxy for my home server, and it requires very little maintenance. I run everything with docker-compose, and I haven't had to work on my setup at all this year, only about 8 hours in 2020 to set up the Wireguard network to replace the ssh tunnels I was using previously for VPS -> server communication.
At the end of the day YMMV and use what you are comfortable with, but it's not as crazy an undertaking as it sounds.
Somewhat OT, but I never realized how expensive those cloud instances are. For comparison, I pay $4.95/month (billed annually) for a KVM VPS with 2 GHz, 2 GB RAM, 40 GB SSD, and 400 GB HDD in the Netherlands. That seems a lot better for self-hosting, where you probably want more raw storage rather than more SSD space.
If that is enough to handle everything you need, then that is definitely a better deal. The electric bill for a similar home server running 24x7 would be more than $6/mo.
I went down an almost identical path/plan, but then stopped due to corruption concerns with doing the VPS / home sync the way that I wanted without a NAS in the middle managing the thing. It’s still possible, but it explodes the complexity.
One of the big things I wanted to accomplish was low cost and easy to integrate / recover from for family in case of bus-factor.
I didn’t expect to compete with the major cloud providers on cost, but the architecture I was dreaming of just wasn’t quite feasible even though it’s tantalizingly close...basically, all the benefits of a p2p internal network with all the convenience of NextCloud and all the export-ability of “just copy all these files to a new disk or cloud provider.”
It’s so close, there’s just always some bottleneck: home upload is too slow, cold cloud storage too hard to integrate with / cache, architecture requires too much maintenance, or similar.
I think NextCloud is very close for personal use, if only there was a plug and play p2p backend datastore / cache backed by plug and play immutable cold storage that could pick up new entries from the p2p layer.
There is a cryptocurrency called Siacoin. It offers cloud storage, and there is a Nextcloud plugin to integrate it as a storage backend. I have some plans to try this setup.
What do you think?
You are absolutely right: if you are not familiar with docker-compose, ssh tunnels, wireguard, etc. it will take more time to set up. That being said, as far as maintenance goes you will probably have a similar experience.
Most of my setup was done through SSH during boring classes in college so I had plenty of time to read documentation and figure out new tools.
After reading through it all, I think this is more a condemnation of the author's diagram (or at least their decision to put that particular one up-front) than of their process in general, or of the challenge itself.
Breakdown of (my) issues with the diagram:
- author's interaction with each device is explicitly included, adding unnecessary noise
- "partial" and "full" real-time sync are shown as separate processes, whereas there's no obvious need to differentiate them in such a high-level overview
- devices with "partial" and "full" sync (see above) are colour-coded differently; again differentiation unnecessary
- including onsite & off-site backups in the same diagram is cool but would probably be nicer living in a dedicated backup diagram for better focus
Oh wow, I didn't realise Asciiflow has started supporting extended ASCII - I've been using Asciiflow (via asciiflow.com) for years, but haven't used it for a few months, and missed this being introduced!
As much as I love the way HN's design goes against many trending "UX" conventions, I think the long-time refusal to put in very very basic simple fixes like this one is bizarre.
The messed up presentation on mobile is 100% a mobile bug, for which there is a very easy fix on the dev side, and no good workaround on the commenter side.
i've actually daydreamed about starting a computing appliance company that would make a variety of services plug and play for consumers and small businesses, from email to storage, to networking, to security, and to smart home. it's actually the direction apple is headed, but they're encumbered by the innovator's dilemma, which leaves an opportunity for an upstart. google and facebook are similarly too focused on adtech, while amazon on commerce, to lock up this market yet.
I've wanted to make something like this too. After years of iteration, my self-hosted setup is now completely automated, and the automation itself is super simple and organized. It would be pretty easy to set up a simple web app that allows users to apply the same automation steps onto their own VPSs. The hardest part would be setting up a secure process for managing user secrets, to be honest.
Business wise, I'm not sure I'd be willing to pay for just the automation... in reality you don't use it very often. Could be interesting to try (re)selling tightly knit VPSs, more advanced automation features or support.
I think this solution still captures the self hosted ideology while also providing some cool value. I see people reinventing the wheel all the time while trying to automate self hosted processes... but then again maybe that's why we do it, we like the adventure!
It would be fun stuff to build, but I feel like you'd struggle to make money. Google and Amazon can afford to give away the hardware, and they can smuggle their ecosystem into your house as a thermostat or a smart speaker or a phone app, or whatever.
Like, how do you persuade the audience of enthusiasts (think: Unifi buyers) to pay for a subscription to managed software they run on their own computers, raspis, whatever? I would probably spend $10/mo on something like that, but much above that and you'd be fighting against the armchair commentary of users who won't appreciate the effort that goes into stability and will basically have a "no wireless, less space than a Nomad, lame" attitude.
Hardware sales. People will pay for the convenience of a device that works out of the box with minimal setup.
On the software side, integrate tightly with your own subscription services (offsite backups, VPS, etc) to upsell to those who want that, and win over the enthusiast crowd by making it possible to host your own alternatives to those services with a little technical know-how.
Open source most components to appeal to enthusiasts, but keep the secret sauce that makes everything seamless and easy to use "source available" so you don't unintentionally turn your core business into a commodity.
Alternatively, it's what Windows Home Server might have become had MSFT kept at it. OTOH, the fact that Microsoft abandoned it might be an indicator of how well such a thing might sell.
You clearly have the upsell part, but where is the "and win over the enthusiast crowd by making it possible to host your own alternatives to those services with a little technical know-how." part?
I and probably many others would be OK with paying for the upsell part, if it's an optional convenience, but nothing I saw on your site indicates it is, or that "own your data" is in any meaningful way true. How do I own my data if any use of it requires me running stuff on your proprietary box, subscribing to your proprietary service?
If you store data on a hard drive you purchased from Best Buy, do you own that data? It's a proprietary box also...
Data from Helm is accessible using IMAP, SMTP, CardDAV, CalDAV, WebDAV on the local network (without requiring our service). You own the device, you own the data. There is a standards-based way of accessing that data just as there is with the hard drive from Best Buy.
> If you store data on a hard drive you purchased from Best Buy, do you own that data?
I can plug the Best Buy hard drive into almost any computer/SAN I want and use it for the purpose I bought it for, without any lock-in. Using the hard drive for my data requires very little trust in Best Buy's good intentions at the time of purchase and zero trust in their continued existence, technical competence or good intentions beyond that -- it is very unlikely that Best Buy will find a way to snoop on my data even if they wanted to. Everything will continue to work fully as intended until mechanical failure sets in. Feels like ownership to me.
In your case I rent some box from you which will lose almost all its intended function the moment your company goes bust or I stop paying you an annual fee. Furthermore it seems I am completely at the mercy of your original and continued good intentions as in addition to using your lock-in for a big price hike later you presumably can also snoop on my data. As far as I can tell there is no really substantial trust differentiator to protonmail. I have to trust them when they claim they won't read my email and are competent enough to keep things secure and I have to trust you (and continue to trust you as long as I want to use the device) that you will encrypt my data and not exfiltrate any of it (or the private key), and furthermore that you run your servers securely enough that no third party will. But what is to stop it? The box is running closed source software that you can remotely update anytime you feel like it, right? I have physical access, but since I don't control the software, what use is that?
Maybe I didn't understand something right, but so far this does not feel like ownership to me.
It sounds more like the worst of all worlds: the lock-in and lack of ownership of a proprietary cloud-based subscription service with the added hassle, inconvenience, downtime and costs of babysitting (and supplying electricity to) a cloud server for you, the provider.
> Data from Helm is accessible using IMAP, SMTP, CardDAV, CalDAV, WebDAV on the local network (without requiring our service).
In what ways is that better than running an IMAP client on my laptop and using it to send data via protonmail, using my own domain, and keeping offline copies of everything (with a periodic upload to backblaze or some other e2e encrypted backup solution)? That seems to offer about the same control over my data, but is cheaper, easier, more convenient, has higher redundancy/uptime and if anything less lock-in. It also doesn't require an additional device that has no use beyond adding an additional failure point and cost center that's my responsibility.
> There is a standards-based way of accessing that data just as there is with the hard drive from Best Buy.
Accessing the data is not enough because what you are selling me is not an overpriced and unergonomic hard drive. You are selling me the ability to send, receive and store email (and likely more).
Don't get me wrong, I kind of like the idea of buying a physical box and a subscription service to self host stuff in a way that gives me better control over my data for an acceptable amount of hassle. But that really requires some amount of openness/auditability and interoperability that currently appears to be absent.
> In your case I rent some box from you which will lose almost all its intended function the moment your company goes bust or I stop paying you an annual fee.
No - we do not rent hardware. When people buy the server from us, they own it. Full stop. There are ongoing costs to make email at home work: a static IP address with good reputation, a security gateway, traffic, etc. If people don't want to pay us for those costs, they will pay them to an ISP and/or an infrastructure provider like AWS. The ease of setup and management comes from the integration of hardware, software and service.
> I am completely at the mercy of your original and continued good intentions as in addition to using your lock-in for a big price hike later you presumably can also snoop on my data.
This is true of any paid service you use right? They can increase your costs at any time. I'm not sure why you think there's something uniquely bad about us for this reason. We have pretty clear values around wanting to know as little about our customers as possible and designing our products end to end around that. We have worked pretty hard at reducing costs, bringing the server price down 60% while doubling its specifications. Our goal is to make this as cost effective and accessible as possible for everyone. We are not interested in locking in customers - it's easy for anyone to take their data off Helm and go to a server of their own making or another service of their choosing. That's not hypothetical - like any company, we have churned customers and supported them in their migration off our product. It's easy to sling these hypotheticals you are concocting but they are not borne out of any reality.
> As far as I can tell there is no really substantial trust differentiator to protonmail.
There is actually a substantial difference. Protonmail holds your data on their servers and therefore can turn it over without a warrant. Well it's encrypted, right? So what could any entity do with that data? Well, Protonmail may be compelled to modify their service to intercept the password on login to decrypt your inbox and turn it over to a government authority (if you don't think that can happen, see what the German government did to Tutanota).
We aren't in a position to do that. Even if the US government came with a court order for your encrypted backups from us, we don't have access to the keys to decrypt them. If we were asked to make firmware changes, we would be retracing the steps of the FBI/Apple San Bernardino case and would enlist the help of the EFF, ACLU and others to fight. I personally believe the case law is pretty clear that they wouldn't win, which is partly why the FBI relented earlier.
> that you can remotely update anytime you feel like it
You make this sound like a terrible thing but really it's not. It allows us to keep our products patched and secured over time.
> In what ways is that better than running an IMAP client on my laptop and using it to send data via protonmail, using my own domain, and keeping offline copies of everything (with a periodic upload to backblaze or some other e2e encrypted backup solution)?
I didn't say people couldn't roll their own solutions. Sure they can - it's just more work, hassle and fragile. And I already covered the tradeoffs of keeping that data in the cloud. Protonmail has access to all your email in the clear (inbound and outbound). We do not and anyone running a server at home would have similar privacy. That's a clear difference.
> Accessing the data is not enough because what you are selling me is not an overpriced and unergonomic hard drive. You are selling me the ability to send, receive and store email (and likely more).
Actually it is because we were talking about data ownership. Your specific dig was about how "own your data" was in any way true ("or that "own your data" is in any meaningful way true" in your parent post).
If you want to self-host email, you need a trustworthy static IP address with reverse DNS. It's considerably more expensive to get this from an ISP. Our annual fee also includes storage for offsite backups. You don't get the same privacy assurances using Protonmail as you do with self-hosting either. For example, Protonmail is privy to the content of all outbound email messages in the clear unless you are communicating with the recipient using E2EE.
From a cost perspective, Helm V2 starts at $199 for 256GB of storage. First year costs, including subscription work out to $298. With Protonmail, their entry level plan with added storage at the same price buys you an inbox with about 28GB, a small fraction of what you would get with Helm storage-wise, not to mention we don't limit users, email addresses, domains, etc.
there are actually tons of companies in this space already making money (e.g., wyze), but it’s highly fragmented and none have a unified vision or product strategy yet. so yes, they’re vulnerable to the behemoths right now, but those dynamics aren’t locked in yet.
it’s mostly tough because of the high upfront capital costs (manufacturing, r&d, and marketing). people still talk fondly about discontinued apple routers and what nest could have been as an independent venture, for example.
Maybe I'm misunderstanding the pitch from the GP, but Wyze seems like it's pretty clearly a hardware + cloud services play, similar to most other IOT ecosystems except maybe Hue. The (optional) monthly cost paid there is for loosened restrictions on an already existing works-anywhere setup— it's an upsell for power users, not a cost of entry.
This seems a lot easier to me than on-prem cloud services, either in BYOH form ("but it's just software") or as a packaged appliance ("another hub to install, really?").
I would say that the closest thing to this right now for paid is coming from the storage side— NAS providers like Synology using hardware sales to support a limited ecosystem of "one click" deployable apps. And for free, it's ecosystems like HomeAssistant, which a lot of people just deploy as a fire-and-forget RPi image, but as expected with a free ecosystem, as soon as you get off the ultra-common use cases, you're reading source code to figure out how it works, and wading through a tangle of unmaintained "community" plugins that only do half of what you want.
the primary value-add is one layer higher than a NAS, a standalone router, or homeassistant but would likely be built on those kinds of things. it's providing a range of hardware devices that can work seamlessly together in a way that you don't have to muck around with config files or programming and yet have it all be secure and private by default. the value is in an ecosystem of safe appliances that require little technical knowledge.
home audio/theater from prior to the internet revolution might be a good analogy: a bunch of separate boxes that each provide tailored functionality but all work together seamlessly without a lot of technical knowledge. that, but for all sorts of computing devices.
> there are actually tons of companies in this space already making money (e.g., wyze), but it’s highly fragmented and none have a unified vision or product strategy yet. so yes, they’re vulnerable to the behemoths right now, but those dynamics aren’t locked in yet.
I also think it's still a little too nerd-focused for the average consumer. I'd say I know far more about security, networking and hardware than the average consumer but, compared to the HN crowd, I know next to nothing. I struggle to use a lot of the current solutions because they get bogged down in doing cool technical stuff that is far outside the scope of the average potential user's wants/needs, or because the DIY solution will be "easy"... for someone with an extensive CS background and years of experience.
i think you need to go up a level of abstraction from this perspective to see the utility for the average consumer. we each have computers all around us: phones, tablets, tvs, and increasingly everything else. for more and more people, it's hopeless to manage, much less understand, these mysterious machines. what you want is a company you can trust to manage these things for you but that gives you ultimate, yet cognitively bounded, control over them.
for instance, plug in a smart device and have confidence that it's not doing surreptitious things behind your back, because it's automatically segregated into its own vlan and given only enough network access to be controlled by you without needing to know much about the underlying technologies involved.
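as a rough idea of what such a product would be automating, here's more or less the manual version on an OpenWrt router today (interface name, zone name, addresses, and VLAN number are all invented, and the bridge device depends on the hardware; a consumer product would hide every line of this):

    # create a separate network for IoT devices (VLAN 30 on the LAN bridge)
    uci set network.iot=interface
    uci set network.iot.proto='static'
    uci set network.iot.device='br-lan.30'
    uci set network.iot.ipaddr='192.168.30.1'
    uci set network.iot.netmask='255.255.255.0'

    # give it its own firewall zone that can't reach the rest of the house
    uci add firewall zone
    uci set firewall.@zone[-1].name='iot'
    uci set firewall.@zone[-1].network='iot'
    uci set firewall.@zone[-1].input='REJECT'
    uci set firewall.@zone[-1].forward='REJECT'
    uci set firewall.@zone[-1].output='ACCEPT'

    # let the trusted LAN talk to the IoT zone, but not the other way around
    uci add firewall forwarding
    uci set firewall.@forwarding[-1].src='lan'
    uci set firewall.@forwarding[-1].dest='iot'

    uci commit && /etc/init.d/network restart && /etc/init.d/firewall restart

nobody's grandparents are typing that; shipping it as the default behavior when a device is plugged in is the product.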
It's not "hopeless to manage", learn some networking and be forever rewarded. Same with learning to manage devices, servers, etc. I develop now but I'd be much less valiable without that background.
"a company you can trust to manage these things for you"
"automatically segregated into its own vlan"
Aren't these goals fundamentally at odds? I would imagine that Joe consumer (if they care at all about any of this) would be rather more inclined to entrust the role of orchestrating/segregating their home network devices to an entity like Google than to some random startup.
the average person doesn't even know what these things are, which is partially why there is a market opportunity here. what they know is that companies like google and facebook are not entirely trustworthy, but they have no alternative. it's hopeless, until an entity comes along and gives them some hope in the form of an alternative. basically all of the things we talk about around preserving privacy and security on the internet need to be built into our devices, and companies like google actively oppose such limitations of their reach into our lives.
You're getting into Ken Thompson's "Trusting Trust" territory here.
When you lose trust you end up with your crazy uncle leaving Fox News for Alex Jones and YouTube. You have people becoming QAnon followers.
I say this not to make a political point, but that the problem is fundamentally hopeless and I see no way out. You end up landing on one side of the fence or the other. You either just don't think about it and continue to use Google and Facebook and remain ignorant of the problem, or you spiral down the never-ending hole of despair.
We have seen articles recently that tell us not even Signal can be fully trusted. Whether or not it's true is beside the point. The point is, not even the HN crowd is safe from the cliff of paranoia. The seed of doubt has been planted.
Is someone going to trust a small tech startup in 2021? No, not like they would have in 1997. The market for trust has effectively been sealed off today. Because, paradoxically, the Googles and Facebooks ruined it all. They stripped us (all of us, not just HN) of our innocence and naivety. We know not to trust Google, but they are also a known known. A small tech company is a total unknown. We're familiar with how Google is going to bend us over. So if someone is going to do us dirty, it may as well be a known entity. Or... you go and build a cabin in the woods and start writing manifestos.
There are very mundane reasons that you may want to own the software stack your data depends on that aren't 'I don't trust Google/Facebook/Microsoft/The Reptilians.'
The most prominent of which is "What happens when they drop support for my use case/lock me out of my account?"
Unfortunately, the cost of running your own one-off solution is rather high. And doubly unfortunately, while I would pay money for a box that I could plug into my computer that provided all of these services, I wouldn't pay enough money to justify someone building it, and selling it to me.
while i agree google and facebook have certainly peed in the pool, that strikes me as overly cynical, if only because, in practice, only a fringe few actually radicalize or become helplessly paranoid in that way.
most people, whether they rationalize it or not, are cognizant that we live in a grey gradient of trust for various companies and brands. the vector field is all sorts of wacky and inscrutable, but maybe we can point a few of those vectors in the right direction and some folks will happily slide along them to better (but perhaps not perfect) safety and privacy.
Having spent the past year frustratingly trying to build these types of things in AWS, and spending too much money on mistakes, I'd say there is a huge opportunity here. SMB or NFS as a service, for example.
https://www.rsync.net/ has been selling this solution for years. Price competitive these days. Not affiliated, just looked at it recently and thought it was extremely cool.
in the late 90's there was Whistle, which partnered with ISPs and delivered pretty much this: router, email, web host, storage space, calendar, firewall, all easy to configure with an ISP.
From what I hear, it's all pretty nicely containerized/turnkey already. There are even several "meta apps" (e.g., HomelabOS, YunoHost, etc.) which act as a base layer on which many of these services are available as "applications" that have been pre-configured and can be trivially instantiated.
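For a taste of how turnkey that gets: on a YunoHost box, installing one of those pre-packaged services is, if memory serves, a single command, with the web admin wrapping the same thing. The app name below is just an example from their catalog:

    # install a pre-packaged app; YunoHost handles the web server, database,
    # SSO and TLS wiring behind the scenes
    yunohost app install nextcloud

    # list what's already installed
    yunohost app list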
> The diagram alone is more than enough of an argument to dissuade me from giving this a shot right now - it's simply too complicated and too much to manage for the amount of time I can dedicate to it.
Yeah. I have a basic home server and I feel like even with fairly modest needs/desires (Jellyfin, Deluge, Zoneminder, some kind of file syncing; I gave up on photos because my whole family uses Google for that), it's hard to find a reasonable workflow/setup that covers it all. It was basically down to partitioning by VM (Proxmox) or by container (Docker), and I went with Docker + Portainer, but I'm not really happy with it; even basic functionality like redeploying a Compose configuration has sat as a feature request for three years [1].
Maybe I'm wanting it to be something that it just isn't, and I'd be happier with microk8s and managing the apps as Helm charts. But is that just inviting additional complexity where none is needed?
I used to have a Portainer-centric setup... now I just use docker-compose directly. I have my compose setup split into different files, with a Makefile to keep things "make start" simple. Highly recommend.
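For anyone curious, the split-files trick is just docker-compose's ability to merge multiple -f files (later files extend/override earlier ones); the Makefile is a thin wrapper so you don't retype the flags. File names here are made up:

    # bring everything up across several compose files
    docker-compose -f core.yml -f media.yml -f monitoring.yml up -d

    # the same flag list works for updates and teardown; a Makefile "start"
    # target just runs these commands for you
    docker-compose -f core.yml -f media.yml -f monitoring.yml pull
    docker-compose -f core.yml -f media.yml -f monitoring.yml down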
The complicated setups we see are complex because they try to save costs by using some part of the cloud. Shared VM resources in the cloud, which are all you really need, are dirt cheap compared to the really simple alternative.
Renting space in a rack at a colo facility and putting an nginx server on it is really simple, but it's also expensive compared to the complex solution in the original post.
As has been pointed out, this can (and probably should) be done over time.
I have migrated from using cloud provided storage to Nextcloud (been running that for over 4 years now without issues), and have my calendar and contacts in there as well.
My ongoing task is to fully migrate all my images, videos and calibre library from Dropbox to other self hosted entities.
Depends on what you want. For file sync, there are definitely similar things out there. As for a generic homelab, I'm writing a similar article at the moment, and I can tell you it will maybe be simpler to follow, but definitely not less complex. Everything depends on your needs. What's important here is that ideally you only need to do this once, and then only do light maintenance.
just use k3s + (restic + velero for backups). it's so much easier: you can basically install everything with the same tooling and update everything with the same tooling. if something breaks, bam, you can just restore the whole cluster with velero (including local volumes).
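for anyone who hasn't seen velero, the day-to-day surface is roughly this (backup names are placeholders, and it assumes velero was installed with an object store and the restic integration enabled):

    # back up the whole cluster, including restic-backed volume data
    velero backup create pre-upgrade --default-volumes-to-restic

    # see what backups exist in the object store
    velero backup get

    # after reinstalling k3s + velero, restore everything from a backup
    velero restore create --from-backup pre-upgrade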
The good thing (and one of the reasons I like Linux and dotfiles) is that you can start right away and keep refining your setup as you go. You don't lose that configuration, which is akin to knowledge.
I bought a Qnap NAS a month ago. I thought I would get it set up right away for my Linux machines, Macbooks, and network. I was wrong. But I'm slowly learning every couple of days, and now I have a systemd service that mounts two volumes over NFS on my Linux machine.
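(For context, the manual version of what that service automates is just a pair of NFS mounts; the NAS address and share names below are made up, and the fstab line is the more idiomatic systemd route.)

    # one-off equivalent of what the systemd unit does at boot
    sudo mount -t nfs 192.168.1.50:/share/photos /mnt/photos
    sudo mount -t nfs 192.168.1.50:/share/media  /mnt/media

    # or, instead of a custom service, an fstab entry with systemd's automount:
    #   192.168.1.50:/share/photos  /mnt/photos  nfs  defaults,x-systemd.automount  0  0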
We always have this debate at work. Do we build the system ourselves or do we purchase a product? On prem Prometheus or push everything to DataDog? I'm always a fan of building things myself because I like building things, but my company compares engineering time vs product cost.
I want what you want, yet how can society reward the work involved in creating a turnkey version of such, other than through the standard capitalist self-interested paradigm?
Ok, I’m SUPER into self hosting, but this article? No way.
1) Duck out isn’t a thing, just stop it.
2) Half the articles cited as examples of corporate abuse were later revealed to be mistakes by the user or easily avoidable pitfalls.
3) Self hosting still requires trust (software you’re running, DNS, domains, ISP, etc...) The line of who to trust and how far is a tough one to answer, even for the informed.
How I solved it:
1) I use well vetted cloud services for things that are difficult/impossible to self host or have a low impact if lost. (Email, domains, github, etc...)
2) I self host things that are absolutely critical with cloud backups. (Files, Photos, code, notes, etc..)
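Roughly what the "critical, self-hosted, with cloud backups" half looks like in practice, using restic as one example (the bucket name and paths are placeholders; any backup tool that encrypts client-side works the same way):

    # credentials come from AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY,
    # and the repository passphrase from RESTIC_PASSWORD
    restic -r s3:s3.amazonaws.com/my-backup-bucket init          # once
    restic -r s3:s3.amazonaws.com/my-backup-bucket backup /srv/files /srv/photos /srv/notes
    restic -r s3:s3.amazonaws.com/my-backup-bucket check
    restic -r s3:s3.amazonaws.com/my-backup-bucket forget --keep-daily 7 --keep-weekly 8 --prune

Only ciphertext ever reaches the cloud, so the provider can lose my business but not read my files.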
I am perpetually confused about why people think that self-hosting on a VPS solves their privacy and security problems. While I'm sure there are controls in place at reputable VPS providers, it wouldn't be too difficult for them to grab absolutely anything they want. Even disk encryption doesn't save you. You're in a VM, they can watch the memory if they need to.
Using a VPS can also make you more identifiable. Your traffic isn't as easily lost in the noise. The worst thing that I know of people doing is using a VPS for VPN tunneling. While it can have its uses, privacy certainly isn't one of them. You're the only one connecting into it and the only traffic coming out of it.
So while I agree with your sentiment, your details are a little off.
“it wouldn't be too difficult for them to grab absolutely anything they want. Even disk encryption doesn't save you. You're in a VM, they can watch the memory if they need to.”
It would be difficult, because you'd have to have host access. VM disk encryption is tied into an HSM or TPM these days, so host access wouldn't help. As for memory, that is now usually encrypted, so no dice there either. The security of a big-name public VPS is astoundingly better than what you can do yourself.
“Using a VPS can make you more identifiable”
I think you have a problem of “threat model” here. You’re mixing up hiding against hackers, governments, etc and just lumping it under “privacy and security”
Using a VPS isn’t going to make you more identifiable to google, because you’re not using google now. Using a VPN isn’t going to make you more identifiable to your ISP, because all they can see is that you have a VPN up.
Why not use a VPS for VPN? Well, you're only right that it would suck if your threat model includes governments or hostile actors; me hiding from my ISP or on a public Wi-Fi? Not a problem.
You conflate a few ideas and threat models.
Security = The ability to not have your stuff accessed or changed.
Privacy = The ability to not have your stuff seen.
Anonymity = The ability to not have your stuff linked back to you.
Threat model = Who are you protecting yourself from?
E.g., the steps I take to not get hacked by the NSA are different than the steps I take to make comments on 4chan or whatever, which are different than the steps I take to use public Wi-Fi.
Ref: I work for Amazon AWS, my opinions are my own insane ramblings.
AMD secure memory encryption and secure encrypted virtualization. Intel probably has something in the works, but today you can take a GCE instance from a signed coreboot through bootloader and kernel with logged attestation at each phase resulting in a VM using per-VM disk encryption key (you have to provide it in the RPC that starts the machine; it's supposedly otherwise ephemeral) with SME encrypted RAM (again, ephemeral per-machine key). Google calls it Confidential VM and Secure Boot for now.
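If anyone wants to kick the tires on that, creating one is roughly a couple of extra flags on instance creation. Flag names and the set of supported images may have drifted since I last looked, so treat this as a sketch, not gospel:

    # Confidential VM (AMD SEV) with Shielded VM secure boot; N2D machine types required
    gcloud compute instances create confidential-test \
        --zone=us-central1-a \
        --machine-type=n2d-standard-2 \
        --confidential-compute \
        --maintenance-policy=TERMINATE \
        --shielded-secure-boot \
        --image-family=ubuntu-2004-lts \
        --image-project=ubuntu-os-cloud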
> It would be difficult because you’d have to have host access.
Which AWS has, by definition.
> VM disk encryption is now tied into an HSM or TPM these days, host access wouldn’t help.
Are you passing all of the data through the TPM? If no: you still need to keep the key in memory somewhere, the TPM is just used for offline storage. If yes: the TPM, and the communication with it, is still under AWS' control.
> As for memory, that is now usually encrypted, so no dice there either.
Still need to keep the key somewhere, so same concern as for disk encryption. Except I can pretty much guarantee you're not putting the TPM on the memory's critical path, so...
> The security of a big name public VPS is astoundingly better than what you can do yourself.
Feel free to back such claims up in the future. Because right now this seems to be as false as the rest of your post.
> Using a VPS isn’t going to make you more identifiable to google, because you’re not using google now.
What? It certainly won't make you less identifiable either.
> Using a VPN isn’t going to make you more identifiable to your ISP, because all they can see is that you have a VPN up.
Your VPN provider, on the other hand, can now see all of the traffic, where before they couldn't. So the question is ultimately whether you trust your ISP or VPN provider more.
> Why not use a VPS for VPN? Well you’re only right it would suck if your threat model includes governments or hostile actors, me hiding from my ISP
Sure, if you trust the Amazon over your ISP that makes perfect sense. Then again, this is the Amazon that seems to love forcing their employees to piss in bottles, and is on a huge misinformation campaign against treating their employees properly.
That seems like an upstanding place with great leadership.
> or on a public Wi-Fi? Not a problem.
Makes some sense, but it wouldn't really give you much more than hosting the VPN at home. (Well, you'd still have to do the same calculus here for home ISP vs Amazon.)
> You conflate a few ideas and threat models.
Pot, meet kettle.
> Ref: I work for Amazon AWS, my opinions are my own insane ramblings.
Good to know that AWS employees are either clueless about their own offerings, or deliberately spreading misinformation.
Again, with the insults.
I comment for your benefit, not mine. I already know the right answers here, it is you who are mistaken.
So you can either consider an alternative experience set which undoubtedly differs from yours, or not.
I don’t care either way.
VPS doesn't solve privacy and security, it solves getting locked out of your account because some algorithm decided you were peddling child porn.
If you want privacy and security and you don't trust your provider, then you have to build your own hardware and compile everything you run on it from vetted source, including your kernel. You can do it, but most people decide that on balance it's better to trust someone.
VPS doesn't solve privacy and security, it solves getting locked out of your account
Does it really? It just seems like instead of trusting a big company that everyone knows, you trust a smaller company that not everyone knows that involves more work for you.
I'm pretty sure I've seen articles on HN where VPS companies (maybe DO?) have kicked people off their infrastructure with zero notice. So, not at all different from being locked out of Apple/Google/Amazon.
Perhaps, depends on your needs, risk model, pricing etc...
I use DO but couldn’t use AWS due to price for example.
Hopping platforms whenever you catch the ire of the gods is a bad CX for this problem space.
As you can see, AWS is far from the only game in town. If you can't find two or three from that list that will meet your needs then perhaps you should reassess your quality metric.
(I note in passing that my preferred provider, Linode, is not even on that list.)
DO = Digital Ocean
Again, depending on your needs, VPS are not a commodity. GCP offers a few things that Azure or AWS don’t, etc...
Often times making sweeping generalizations without deep industry knowledge is a bad idea.
If you don’t even know what DO is, why do you feel experienced enough to argue?
Because I've been self-hosting my internet services on VPS's for the last 15 years on various providers. There are literally dozens of them. They are absolutely a commodity.
"Cloud" and "VPS" are not the same thing. VPS = Virtual Private Server, which is one very specific kind of cloud service. Cloud services in general are not a commodity, but VPS is.
How so? The VPS provider can shut you down as well. You might say the migration path is easier, but there will be a weak link somewhere. Even if you put a datacenter in your basement, you need to connect to the internet somehow, and that can be taken away.
well, DO decided to lock me out of the account I'd had for years because they decided I was a fraud, and I had to deal with their terrible customer service.
With rclone you can encrypt data locally while uploading. This allows you to host everything from home and use the cloud only for backups, basically end-to-end encrypted.
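Concretely, assuming `rclone config` has already been used to set up a cloud remote (called `b2` here, as a placeholder) plus a `crypt` remote named `secret` layered on top of it:

    # everything is encrypted client-side before upload; the provider only stores ciphertext
    rclone sync /srv/nextcloud/data secret:nextcloud-backup

    # reading back through the crypt remote decrypts transparently...
    rclone ls secret:nextcloud-backup

    # ...while the underlying remote only shows encrypted blobs
    rclone ls b2:my-bucket/nextcloud-backup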
A setup that probably works is vps -> tor -> vpn or some other order of these three, but I couldn't find any sort of blog that detailed setting up something like this so I imagine very few people are doing it.
I always think of it as: how many examples of "I got locked out of all my data!" would there be if billions of people started following the author's advice? Definitely more than the ~5 they list (whether that is user error or actually Apple/Google/Amazon's fault).
the 'duck it out' thing really made me cringe. we really need to get away from the idea of having a search verb that is tied to the popular search engines of the day. i use duckduckgo, but it might not be around in 10 or 20 years, or there might be something better by then, so it's pointless to expect everyone to keep learning new verbs all the time.