Ask HN: Can any Hetzner user please explain your workflow on Hetzner?
113 points by nerdyadventurer on March 22, 2023 | 69 comments
I am thinking of trying out Hetzner for hosting front-ends, back-ends. I have some questions about the workflow on Hetzner.

How do you

- deploy from source repo? Terraform?

- keep software up to date? ex: Postgres, OS

- do load balancing? built-in load balancer?

- handle scaling? Terraform?

- automate backups? ex: databases, storage. Do you use provided backups and snapshots?

- maintain security? built-in firewall and DDoS protection?

If there are any open source automation scripts, please share.




Not hetzner but a similar provider:

    - Deploy by stopping the server, rsyncing in the changes, and starting the server. The whole thing is automated by script and takes 5 seconds which is acceptable for us.
    - Run apt upgrade manually biweekly or so.
    - We use client-side load balancing (the client picks an app server at random) but most cloud providers will give you a load balancer IP that transparently does the same thing (not for free though).
    - For scaling just manually rent more servers.
    - For backups we use a cronjob that does the backup and then uploads to MEGA
    - For security we set up a script that runs iptables-restore, but this isn't really all that necessary if you don't run anything that listens on the network (except your own server, obviously).
    - DDoS is handled transparently by our provider.
While this might change if you're super big and have thousands of servers, in my experience simple is best and "just shell scripts" is the simplest solution to most sysadmin problems.
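
The client-side load balancing mentioned above (the client picks an app server at random) could look roughly like this in Python. This is only a sketch: the server names and the `send` callable are hypothetical, and falling back to another server on failure is an assumption, not something described above.

```python
import random
from typing import Callable, Optional, Sequence


def call_with_random_server(
    servers: Sequence[str],
    send: Callable[[str], str],
    rng: Optional[random.Random] = None,
) -> str:
    """Try the app servers in random order and return the first successful reply.

    `send` is a hypothetical function that performs the request against one
    server and raises OSError (e.g. ConnectionError) when it is unreachable.
    """
    if not servers:
        raise ValueError("no app servers configured")
    rng = rng or random.Random()
    candidates = list(servers)
    rng.shuffle(candidates)  # random order spreads load across the servers
    last_error: Optional[OSError] = None
    for server in candidates:
        try:
            return send(server)
        except OSError as exc:  # unreachable server: move on to the next one
            last_error = exc
    raise ConnectionError("all app servers failed") from last_error
```

With two servers, each one receives about half of the requests on average, and a dead server only costs the client one failed attempt before it moves on.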


Ansible can be a big step forward from "just shell scripts". I was on the fence for a while, but it does make things easier quite quickly, even for smaller deployments. Even for just documenting wtf is running on the servers and how.


What I like about Ansible:

  - It provides a standard procedural format for my shell scripts
  - It comes with some level of type-checking when I play a script
  - It makes me actually test that my procedures can bootstrap
  - The playbook style helps me keep scripts organized
  - It lets me start from "You have a server." without questioning where it came from.
  - Ansible, while "bottom-up", lets me bootstrap "top-down" systems like Kubernetes, container registries, etc.


In my experience, Ansible is fantastic for provisioning a machine from bare Linux VM to running service.

OTOH it's quite slow when used for deployments. There's no way you would be getting 5 second deployments with it.

My favorite middle ground between shell scripts and Ansible is Fabric (https://www.fabfile.org/).


Ansible isn't a speed demon, but note that Ansible, like Docker over SSH (export DOCKER_HOST="ssh://user@remotehost"), greatly benefits from persistent SSH connections, configured via ~/.ssh/config or the Ansible config.

https://docs.rackspace.com/blog/speeding-up-ssh-session-crea...

https://www.redhat.com/sysadmin/faster-ansible-playbook-exec...

Apparently Docker also lets you point at different hosts via "context":

https://docs.docker.com/engine/context/working-with-contexts


> Ansible can be a big step forward from "just shell scripts".

It can be. It also can not-be. I'd recommend anybody to start looking once they reach more than 2 machines, but yeah, depending on what you are doing it can add value when you have 1 or 2 machines too.


ssh root@hetzner-server-ip "cd my-server && git pull && ./prepare.sh && systemctl restart my.service && journalctl -u my.service -f"

To expand a little bit:

- It's a very small service

- I use sqlite db

- Preparation step before the restart ensures all the deps are downloaded for the new repo state. I.e. "a build step"

- I use simple nginx in front of the web server itself

- Backups are implemented as a cron job that sends my whole db as an email attachment to myself

- journalctl shows how it restarted so I see it's working
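
For the SQLite backup step, copying the database file while the service is writing to it can capture a half-written state. A minimal sketch that takes a consistent snapshot first, using the backup method in Python's `sqlite3` module, is below. The file paths are hypothetical, and mailing or uploading the snapshot afterwards is left out.

```python
import sqlite3
from pathlib import Path


def snapshot_sqlite(db_path: str, backup_path: str) -> Path:
    """Write a consistent copy of a (possibly in-use) SQLite database."""
    source = sqlite3.connect(db_path)
    target = sqlite3.connect(backup_path)
    try:
        # Connection.backup uses SQLite's online backup API, so the copy
        # is consistent even if the service is writing at the same time.
        source.backup(target)
    finally:
        target.close()
        source.close()
    return Path(backup_path)
```

A cron job could call `snapshot_sqlite("/srv/my-server/app.db", "/tmp/app-backup.db")` and then send the resulting file wherever the backups go.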


So, you have the ssh port (22? custom one?) open to the public on the same machine in which you have your web server/nginx? May I ask why? I had one similar very simple service on Digitalocean once, and my setup was:

- bastion host: custom port for ssh open to the public

- virtual private cloud (vpc): inside I put my web-server and my db server. These servers are not accessible to the public, but the bastion host has access to the vpc

- another server for my nginx. This is public and it redirects requests to my web-server in the vpc

I know it sounds overkill! But somehow it gives me the (illusion?) of being more secure. Am I right with this setup or I'm just wasting my time (and money)? I know perhaps that a VPN could be better... but somehow I found the idea of bastion+vpc quite simple and effective.


Enable certificate authentication only, and turn off password authentication. And you are safe. A million bots attacking a billion times won't break in.

You don't need to put any more security on top of it, like changing the SSH port or running fail2ban, unless you want to reduce the CPU load spent on handling automated port scanners and bots.


Certificate authentication is overkill and rolling it yourself is painful enough that huge companies have been built around it (Teleport). Unless you're an enterprise SSHer with tons of ephemeral hosts, use public keys instead.


It does not look so painful. I have found this guide helpful [https://jameshfisher.com/2018/03/16/how-to-create-an-ssh-cer...]


I'd guess they just talk about pub key authentication?


I mean, it sounds like in OP's case it's 1 service instance to small N (maybe even 1) potential systems to deploy from. Manual SSH public key deployment doesn't scale to corporate scale, but it certainly scales that far


CA isn't terribly difficult. https://keybase.io/blog/keybase-ssh-ca


Or get a hardware token and add it to the authorized_keys. Depends on how many machines you have to setup.


I'm not sure if a bastion host for a single server is all that much better. If OpenSSH allows access to hackers, they'll break into your bastion host and move on from there. All you're really adding is one more host to forget to patch.

You're only ever more secure if you reduce the attack surface. These days, with WireGuard's simple and secure tunnels, I'd say a VPN may be an improvement, but I'm not all that worried about SSH on my servers. Either disabling password logins or using secure passwords should be fine in most cases.

I personally change the SSH port as well, not really for security but mostly because it keeps the logs clean. Port scans will still happen but you won't get bombarded by thousands of pi@server.com sessions failing every day of the week.


Hetzner has a remote console tool for "local" terminal access.

I use it to enable/disable sshd during use.


SSH with only public key auth allowed is perfectly safe to have exposed to the internet.


it is still a single factor - and private-key compromises are not unheard of.

(but ssh itself has plenty of ways to harden, not to mention the sk stuff)


I have ssh enabled on all my servers, sometimes port 22 sometimes other ports. I have never had a break in. I use fail2ban but I don't know if that's really necessary. But I use it anyway to secure other services (e.g. wordpress instances) against brute-forcing. It goes without saying that password access and root login should be disabled.

If you want to go to crazy lengths to hide your ssh then do port knocking or something.


I only login with my SSH keys, so I don't see the problem — I'm protected with cryptography.


I use virtually this same setup and we do around 100,000 users per day.


I only use it for side projects right now, and in the past for a real production application for which "high availability" was not a problem (I could do occasional maintenance windows outside work hours). Here's how I did it in case it helps you:

> deploy from source repo? Terraform?

I use Dokku (https://dokku.com/), then the workflow is the same as if you'd be using Heroku

> keep software up to date? ex: Postgres, OS

Automatic Ubuntu updates, plus once a week I SSH into it and run apt-get update, etc.

> do load balancing? built-in load balancer?

I just don't. I don't need it for the load of my projects.

> handle scaling? Terraform?

Just vertical scaling for now. A single powerful server can do great before you might need to add more servers.

> automate backups? ex: databases, storage. Do you use provided backups and snapshots?

I just enable the "backup" feature on their admin panel. Adds 20% to the cost but works great and it's easy.

> maintain security? built-in firewall and DDoS protection?

I only expose the HTTP(S) and SSH ports, and I have also set up fail2ban against brute-force attacks.

> If there is any open source automation scripts please share.

Dokku.


> 50 machines at hetzner

- install machines with ansible (using hetzner scripts for OS install)

- machines communicate over vswitch/vlans, external interfaces disabled whenever possible. Pay attention to the custom mtu trick.

- harden machines, unattended-upgrades mandatory on each machine

- ssh open with IP whitelists from iptables on gateways

- machines organized as k8s clusters, took ~1 year to have everything working cleanly

- everything deployed as k8s resources (kustomize, fluxcd, gitops)

- use keepalived for external IPs with floating IPs for ingress on 3 machines per cluster

Machines are managed as cattle; it takes less than 1h, plus Hetzner provisioning time, to add as many machines as we need.


> Pay attention to the custom mtu trick.

I wish Hetzner made this more clear up front. Maybe a big red banner on the vSwitch page. I can't count the number of hours I've spent troubleshooting network issues at Hetzner that came down to MTU.


As someone who is trialing Hetzner is there a link to info on this?


The vSwitch docs tell you to limit MTU to 1400 https://docs.hetzner.com/robot/dedicated-server/network/vswi.... If you fail to do this it'll default to 1500 I think and manifest in unpredictable ways. Like being able to fetch updates one moment, then being unable to connect to the update server a few minutes later.
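
One way to check whether a 1400-byte MTU actually passes between two hosts is to ping with fragmentation prohibited and the largest payload that still fits. The arithmetic is sketched below, assuming IPv4 without IP options (20-byte header), an 8-byte ICMP header and a 20-byte TCP header without options; the `ping -M do -s` form printed at the end is the Linux iputils syntax, and `<peer-ip>` is a placeholder.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo header
TCP_HEADER_BYTES = 20   # TCP header without options


def max_ping_payload(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


def max_tcp_segment(mtu: int) -> int:
    """Largest TCP payload per packet (MSS) for the given MTU over IPv4."""
    return mtu - IPV4_HEADER_BYTES - TCP_HEADER_BYTES


if __name__ == "__main__":
    mtu = 1400  # the value the vSwitch docs ask for
    print(f"ping -M do -s {max_ping_payload(mtu)} <peer-ip>")
    print(f"TCP MSS at MTU {mtu}: {max_tcp_segment(mtu)}")
```

If the ping with that payload size works but a slightly larger one fails, the path MTU matches the configured value.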


https://github.com/hetznercloud/awesome-hcloud/ collects various devops tools for Hetzner Cloud.


The recent demo of MRSK from 37signals used Hetzner as the first example:

> Introducing MRSK - 37signals way to deploy

https://www.youtube.com/watch?v=LL1cV2FXZ5I


It's not even close to major public cloud providers, but this is my setup:

* https://github.com/kube-hetzner/terraform-hcloud-kube-hetzne... (Terraform, Kubernetes bootstrap)

* Flux for CI

* nginx-ingress + Hetzner Loadbalancer (thanks to https://github.com/hetznercloud/hcloud-cloud-controller-mana...)

* Hetzner storage volumes (thanks to https://github.com/hetznercloud/csi-driver)

Kube-Hetzner supports Hetzner Cloud loadbalancers and volumes out of the box, though it also supports other components.


For my hobby server:

  - Running dokku with Heroku Buildpacks to deploy both from source and to run Docker images behind an nginx reverse proxy.
  - Auto-upgrading apt packages, manually updating the OS.
  - No load balancing.
  - No scaling.
  - Automated backups with restic/rclone to OneDrive.
  - Hetzner firewall, no DDoS protection.


Manually provision long-running VMs and manage containers with yacht.sh and that's it really. There's nothing special about Hetzner that makes it qualitatively different from any other cloud provider, except for enterprise features.


- Deploy using docker swarm, CI ssh into machine, pull repo and run

- don't remember the last time I updated lol

- traefik + worker nodes on docker swarm

- again docker swarm

- I have a cronjob that makes backup using postgres, then uploads it to a digitalocean spaces, you can just use S3 as well

- I'm using cloudflare in front of server, but I also use inbuilt firewall as I host a postgres server with hetzner(only allow traffic from the web server worker nodes)
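
A cron-driven Postgres backup along the lines of the one described above might be sketched like this. The database name and backup directory are hypothetical, and the upload to object storage is deliberately left out; the only thing assumed to be installed is `pg_dump`, called with its standard `--format`, `--file` and `--dbname` options.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path
from typing import List


def build_pg_dump_command(dbname: str, backup_dir: str, now: datetime) -> List[str]:
    """Return the pg_dump argv for a timestamped custom-format dump."""
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    target = Path(backup_dir) / f"{dbname}-{stamp}.dump"
    return ["pg_dump", "--format=custom", f"--file={target}", f"--dbname={dbname}"]


def run_backup(dbname: str, backup_dir: str) -> None:
    """Run pg_dump once; intended to be called from a cron job."""
    Path(backup_dir).mkdir(parents=True, exist_ok=True)
    command = build_pg_dump_command(dbname, backup_dir, datetime.now(timezone.utc))
    subprocess.run(command, check=True)  # raises CalledProcessError on failure
```

Keeping the command construction separate from the execution makes it easy to check the file naming without a running database.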


How many servers do you have on this setup? How do you handle persistent storage with docker swarm?


we use https://github.com/costela/docker-volume-hetzner which is really stable.

CSI support for Swarm is in beta as well and already merged in the Hetzner CSI driver (https://github.com/hetznercloud/csi-driver/tree/main/deploy/...). There are some rough edges atm with Docker + CSI so I would stick with docker-volume-hetzner for now for prod usage.

Disclaimer: I contributed to both repos.


> maintain security? built-in firewall and DDoS protection?

I have a Hetzner dedicated server (not the Cloud offering) and I set up OPNsense as an all-in-one routing and firewall solution in a separate VM. All incoming and outgoing traffic goes through this OPNsense VM, which acts as the default gateway for the host system and all other VMs/Docker containers. You either need to book a second public IPv4 address (or just use IPv6 for free if that is good enough for your use case, since each server comes with an IPv6 /64 subnet), or, if you want just one IPv4 address, you could do some MAC spoofing on the main eth interface of the host OS and give the actual MAC address and public IP to the OPNsense WAN interface. This is necessary because Hetzner has some MAC address filtering in place, meaning only the MAC address associated with the public IP is allowed to send traffic.


I provision a single VPS that acts as Terraform & Ansible control:

  - Store and run Terraform setup in git
  - Store and distribute SSH keys
  - Store and run Ansible scripts for bootstrapping (e.g. Kubernetes clusters on dedicated, or more VPS'es)
  - Host VPN and some low-intensity services (I'd delegate both of these if I had a bigger budget)
Specifically, this replaces the use of Terraform Cloud.

I enjoyed using Terraform Cloud for a more cloudy setup with easy GitHub pull-request integration at a past employer.

But I'm specifically aiming for simplicity here. It doesn't scale as well to a team of 2+ without establishing conventions.

I haven't explored what self-hosted alternatives there are to Terraform Cloud.


Have you tried using one of the different terraform backends? I usually have mine backed to a gcs bucket so I can run terraform from a CI job and have it maintain state correctly.


I have only experimented, but I haven't settled on anything.

I actually like having my Terraform single source of truth in local git (backed up).

What I'm missing from Terraform Cloud is the `terraform plan` on pull-request submission and `terraform apply` upon merge.

I might do that with ArgoCD. And better CI/CD integration in Forgejo. But that's a long shot still.


I have been working on https://instellar.app to solve this very problem. It allows you to use s3 compatible storage and your compute / database provider. So you can use hetzner or digitalocean or AWS or google cloud, anything you want. For your database you can use digitalocean’s managed / Aiven.io / RDS / Google cloud SQL. This tool brings it all together and enable you simply focus on shipping code.

It does load balancing / automatic ssl issuing out of the box. It will also allow you to scale horizontally. I’m working towards making it public soon.


Github only? Anything else on roadmap?

Understandable for market if not, I will just be disappointed personally!


The focus for now is get it to public beta starting with Github. On the roadmap we need to add Gitlab and Bitbucket for sure.

If there anything else that I've not covered please let me know. Would greatly appreciate the feedback.


Not a Hetzner user, but I believe you can do pretty much anything you can use any other VPS for. I deploy all my stuff on a single server using https://lunni.dev/ (disclaimer: I'm also the author of Lunni). It is a web interface over Docker Swarm with sane defaults for working with web apps.

- Deploy from source repo: Lunni docs guide you how to setup CI building your repo as a docker image, and you can create a webhook that pulls it and redeploys.

- Scaling, load balancing: in theory you can just throw more servers in the swarm, tweak your configuration a bit and it should work. However, I've yet to run past what a single, moderately beefy server can handle :')

- Automate backups: definitely on my roadmap! Right now I'm configuring them manually on critical services, and doing them manually every now and then using the Vackup script.

- Maintain security: Docker's virtual networks acts as a de-facto firewall here. In Lunni, you only expose services you need to the reverse proxy (for HTTP), and if you absolutely must expose some ports directly (e. g. SSH for Git), you have to explicitly list them.

Some other similar alternatives to consider: Dokku, Coolify, Portainer with Traefik / Caddy / nginx-gen. I'll be glad if you choose Lunni though :-) Let me know if you have any questions!


For dedicated servers:

- deploy from source repo? Terraform?

* local build server, which rsyncs to application servers (e.g. files), or through a Docker registry
* scripts to start/stop/restart services
* centralised database of which services run on which servers, which serves as the basis for deciding where specific applications run

- keep software up to date? ex: Postgres, OS

* ansible for automated installs (through the Hetzner API)
* ansible scripts to execute commands on servers (e.g. update software, or adapt the firewall when new hosts are added)

- do load balancing? built-in load balancer?

* proxy to route requests to multiple backend servers (e.g. nginx)
* flexi ip (needs to be manually remapped to a new server over the API in case of failure, so you need to check yourself that the IP is reachable)

- handle scaling? Terraform?

* more servers

- automate backups? ex: databases, storage. Do you use provided backups and snapshots?

* Separate HDFS cluster, which allows production nodes to write once and read data, but not delete or overwrite any data.
* For less data, you could also use their backup servers.
* The "backups and snapshots" feature you mention is only available for vservers, not for dedicated servers.

- maintain security? built-in firewall and DDoS protection?

* Hetzner router firewall
* Software firewall (managed through ansible)
* Don't use their VLAN feature, as there often seem to be connectivity problems (see their forum).
* Never had DDoS issues

- monitoring of failures:

* internal tool to monitor hardware and software issues (e.g. wrongly deployed software, etc.).


I run traefik in docker, and then I run various other random shit including my stepdaughter's Minecraft server in docker.

Every couple of months I remember to pay the bill, then start browsing the auction page, then think "hey that thing isn't much more than I'm paying now, maybe I should upgrade...", but mostly I just stick with things as they are.


It really depends a lot on what you get from Hetzner. Their cloud offerings are kinda weird (few features, high prices), so we buy dedicated servers and run our own containers on top of that.

Deploy from source: Gitlab CI builds and deploys containers

Keep software up to date: Deploy new containers / migrate all containers off a host to upgrade it with OS tools (Debian for us, so just apt dist-upgrade)

Load balancing: nginx container

Scaling: Hasn't really been an issue for us yet, but terraform/k8s work fine from what I've heard

Backups: Dedicated SX server pulls backups via rsnapshot, including DB dumps. All data is on minutely replicated ZFS pools, so we got short-term snapshots for free anyway.

Security: Still on IPTables and Fail2ban for on-system stuff. DDoS protection from Hetzner itself is okay-ish, but for really critical sites Akamai or Cloudflare are still the safer choices. Both work fine.


We use hetzner cloud with terraform and a self-hosted kubernetes cluster. Everything else is self-baked obviously.


Lots of fancy scripts around. I don't use any, I just configure new servers with an ansible playbook from my laptop, and generally do stuff by hand after that. I don't have anything I'd call "production", just personal and dev stuff. I have a cheap dedicated server that I use as a beataround and for long running computations, and occasionally spin up Hetzner cloud instances for temporary usages. I don't automate backups. I have 5TB of backup space in Hetzner Storage Box (10 euro/month for that!) and manually backup to it with Borg Backup and a few small shell commands in the .bashrc in my ansible script.


    - deploy from source repo? Terraform?
        rsync

    - keep software up to date? ex: Postgres, OS
        apt-get

    - automate backups? ex: databases, storage.
        rsync, pg_dump

    - maintain security? 
        systemd-nspawn


- Applications and DBs run in docker containers. Deploying is basically git pull && docker-compose up --build -d

- Apt auto-upgrades, other software updates are handled in docker. The only software on the machine is haproxy, git and docker for deploys, newrelic and vector for monitoring.

- Haproxy runs on the server to route requests to docker containers. Cloudflare loadbalancing routes to servers.

- Scaling is avoided through over-provisioning cheap Hetzner machines. Adding new machines is done so rarely that a bash script is fine.

- DB backups are done in docker.

- Ufw locks down to ports 22, 80, 443, and DB ports. Because docker can interact with firewalls in surprising ways, I also replicate the rules in the Hetzner firewall.


Migrated from Linode to Hetzner. My workflow has stayed the same:

* Deploying using Git and Capistrano: `git push && cap production deploy` (aliased to cpd)

* Using Hetzner backups + daily backups to Tarsnap using cron

* Updating software by SSH-ing into a server and updating apt packages; I update Ruby gems locally

* For security, built-in firewall + ufw, two-factor authentication, public key-only authentication (SSH key is protected with a password), SSH running on a non-standard port with a non-standard username.

* I use sqlite as a database and caddy as a web server


I am not sure what level of abstraction and automation you are aiming for but there is a pretty neat project for setting up a kubernetes cluster including automatic updates in hetzner [1]. Even if it exceeds your requirements you can scrape it to answer many of your questions.

[1] https://github.com/kube-hetzner/terraform-hcloud-kube-hetzne...


I've been using docker swarm + traefik + portainer and I'm quite happy. I orchestrate everything with Ansible [1]. The only manual process I have is provisioning the servers / load balancers.

It provides a super nice balance between going all-manual VPS and going all in on the Kubernetes Kool-Aid.

[1] https://github.com/sergioisidoro/honey-swarm


  - deploy from source repo? Github copy Go binary 
  - keep software up to date? Using Hetzner Cloud + hosted Postgres
  - do load balancing? Hetzner LB + DNSMadeEasy LB failover
  - handle scaling? I don't need to scale fast
  - automate backups? Snapshots + hosted Postgres
  - maintain security? SSH on other port, Hetzner private networks, built-in firewall and DDoS protection


I just ssh 'git pull && ./deploy.sh' which rolls back on deploy error.

traffic: no ddos protection, no load balancing

backups: daily automated backups provided by host. no incrementals.

update: unattended updates; the software is tested so that it doesn't break when databases and message queues restart due to unattended upgrades.

security: intact selinux, ufw, proper users and permissions.
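
A deploy script that rolls back on error, like the `git pull && ./deploy.sh` step above, could be structured roughly as follows. The actual `deploy.sh` is not shown in this comment, so everything here is an assumption: the step commands, the use of `git reset --hard` to return to the previous commit, and the `run` callable, which defaults to a thin wrapper around `subprocess.run`.

```python
import subprocess
from typing import Callable, List, Sequence

Runner = Callable[[Sequence[str]], str]


def run_checked(argv: Sequence[str]) -> str:
    """Run a command, raise CalledProcessError on failure, return its stdout."""
    done = subprocess.run(list(argv), check=True, capture_output=True, text=True)
    return done.stdout.strip()


def deploy_with_rollback(steps: List[Sequence[str]], run: Runner = run_checked) -> bool:
    """Pull the latest code, run the deploy steps, and roll back if one fails.

    Returns True when the new revision was deployed, False after a rollback.
    """
    previous = run(["git", "rev-parse", "HEAD"])  # remember the known-good commit
    run(["git", "pull", "--ff-only"])
    try:
        for step in steps:
            run(step)
    except subprocess.CalledProcessError:
        # Return the working tree to the previous commit and redo the steps there.
        run(["git", "reset", "--hard", previous])
        for step in steps:
            run(step)
        return False
    return True
```

A call such as `deploy_with_rollback([["./prepare.sh"], ["systemctl", "restart", "my.service"]])` would then either leave the new revision running or restore and restart the old one.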


Sorry for this blatant self-promotion. If you're looking for managed Kubernetes I'm building https://symbiosis.host which is a service built on top of Hetzner, with support for terraform, load balancers, storage, etc.


Deployment from a bash script that ssh's into the hetzner VPS and git pulls the data and restarts the server.

OS kept up-to-date manually.

No load balancing necessary, it's one server.

No scaling necessary, it's a few thousand users.

Backups: cron with script that s3-compatible copies over to off-site cloud every 6 hours.

Security: firewall yes, DDoS protection no.


I use their instance to run caprover for all my apps. That's basically about it. I use hetzner's backup service, it saved me once recently.

DDoS protection could be offloaded to Cloudflare; I don't need it personally.

I don't need to scale yet. But I believe caprover is somewhat scalable.

Security? As others said, SSH keys.


caddy, simple docker compose runtimes with watchtower for updates.

Hetzner is just a bunch of vms, they are all connected over wireguard for ease of use. UFW at the edge for locking down ports.

No DDoS protection, but I can turn it on in cloudflare which I use for DNS.


Ansible for server configuration/changes/deployments

Rundeck to automate/schedule jobs/deployments/upgrades or scale deployments (to fleet of servers)


Question for those who encrypt disk on Hetzner with LUKS. How can you get it to auto assign private IP from DHCP on boot?


hmmm, mainly:

* ansible for CM (first 4 points)

btw. i don't do any deploys from source-repos, either build packages and use your favorite distributions package-mgmt or use containers.

* some shell/awk/perl/python-scripts for backups & security-related stuff :)


It seems you could be interested in a fully managed service over Hetzner handling security/firewall/monitoring/alerts/backups but also OS / Software updates and CI/CD pipelines from your repos

Please check: https://elest.io

Disclaimer: I'm CTO & founder


I’ve been working on a new fully automated setup with 1 click.

Right now I provision my nodes automatically with Terraform. I use cloud init scripts during machine initialization and an adhoc remote provisioner for some firewall stuff and config updates after complete.

This is for boot. For configuration management I’m working on getting my Saltstack complete and easy to use.

Saltstack can be used like chef/ansible but it’s much more intuitive to me and very flexible. This is for automating and managing package installation on my nodes, firewall rules, grouping nodes by config, etc.

What’s also cool with salt is you can have it make changes based on a web hook (salt reactor), ie merging commit into master.

My plan is to basically version control everything into salt so things like VPN setup, software, alerts are all automatically setup. I would love to extend this to also manage a NAS with automated backups.

TL;DR: I am migrating my flow from automated deployments from GitHub using Ansible to automated provisioning and deployments using Salt, Terraform and GitHub/Gitea.


Are we talking about their cloud services or dedicated servers? I (and a couple of clients) use their dedicated servers, the procedure is the same as with any bare metal hosting. Here's the setup of my own servers (one at Falkenstein data center and one at Helsinki). My use case is small apps, with a couple hundred concurrent users at most. If you need a more dramatic infrastructure that scales up automatically and auto-deploys software left and right, that's a whole different ball game.

- Proxmox as the base OS, stock install. Close every port except SSH, 80, 443 (alternatively you may want to go with Wireguard instead of SSH). There is an nginx instance running in front of the containers, it passes data along to them as per config. Otherwise, nothing is reachable from the outside.

- Servers are on Proxmox containers, mostly also Nginx, some Nodejs, some other, you know the drill. The containers are pretty low overhead, so you can implement basically any deployment strategy in that environment. They're also easy to back up and to replicate to other machines.

> keep software up to date? ex: Postgres, OS

I run a periodic "apt update && apt upgrade -y && apt autoremove -y" as a cron job on most containers. Some configurations tend to break occasionally, so I do those specific ones manually or with additional scripts. I have a repo of scripts and snippets that I use everywhere, just little hacks that accumulated over the years because they automate useful things.

> do load balancing? built-in load balancer?

That depends on where your loads are, and what the structural needs of your applications are. If this is about external web requests to a mostly read-heavy application, I highly suggest using a CDN such as Cloudflare rather than rolling your own. That being said, Nginx makes load balancing pretty painless.

> automate backups? ex: databases, storage. Do you use provided backups and snapshots?

Their storage offering is pretty okay, but I would consider restoring a whole-system backup a last resort. Proxmox has built-in support for container snapshots/backups, which gives you more granular control. These snapshots are also easy to rsync periodically to another host. If the physical machine dies, you just start the container on another host from a recent backup. There are HA options for this on Proxmox if you link more than one host into a cluster (which is overkill for most setups).

> maintain security? built-in firewall and DDoS protection?

Close down your ports. No complicated firewall rules, either. Just block anything that isn't directed at one of your 3 necessary ports. With DDoS protection: don't roll your own, use a CDN. Also, install only things you can audit or come from a reasonably safe source. For instance, I would highly discourage running npm installs/updates unsupervised. If you have a production app that needs to work and needs to be reasonably secure, don't automatically pull data from free-for-all package managers - deploy them with reviewed or known-good versions hard locked (or deploy them with dependencies already included).

As a final tip: Hetzner servers come with RAID setups (usually RAID1). Monitor the status of those drives! If one fails, tell them to replace it. They will usually do it within the hour on a running system.


We use Docker Swarm for our deployments, so I will answer the questions based on that.

We have built some tooling around setting up and maintaining the swarm using ansible [0]. We also added some Hetzner flavour to that [1] which allows us to automatically spin up completely new clusters in a really short amount of time.

deploy from source repo:

- We use Azure DevOps pipelines that automate deployments based on environment configs living in an encrypted state in Git repos. We use [2] and [3] to make it easier to organize the deployments using `docker stack deploy` under the hood.

keep software up to date:

- We are currently looking into CVE scanners that export into prometheus to give us an idea of what we should update

load balancing:

- depending on the project, Hetzner LB or Cloudflare

handle scaling:

- manually, but i would love to build some autoscaler for swarm that interacts with our tooling [0] and [1]

automate backups:

- docker swarm cronjobs either via jobs with restart condition and a delay or [4]

maintain security:

- Hetzner LB is front facing. Communication is done via encrypted networks inside Hetzner private cloud networks

- [0] https://github.com/neuroforgede/swarmsible

- [1] https://github.com/neuroforgede/swarmsible-hetzner

- [2] https://github.com/neuroforgede/nothelm.py

- [3] https://github.com/neuroforgede/docker-stack-deploy

===================

EDIT - about storage:

We use cloud volumes.

For drivers:

We use https://github.com/costela/docker-volume-hetzner which is really stable.

CSI support for Swarm is in beta as well and already merged in the Hetzner CSI driver (https://github.com/hetznercloud/csi-driver/tree/main/deploy/...). There are some rough edges atm with Docker + CSI so I would stick with docker-volume-hetzner for now for prod usage.

Disclaimer: I contributed to both repos.


I use Hetzner, Contabo, Time4VPS and other platforms in pretty much the same way (as IaaS VPS providers on top of which I run software, as opposed to SaaS/PaaS), but here's a quick glance at how I do things, with mostly cloud agnostic software.

> deploy from source repo? Terraform?

Personally, I use Gitea for my repos and Drone CI for CI/CD.

Gitea: https://gitea.io/en-us/

Drone CI: https://www.drone.io/

Some might prefer Woodpecker due to licensing: https://woodpecker-ci.org/ but honestly most solutions out there are okay, even Jenkins.

Then I have some sort of a container cluster on the servers, so I can easily deploy things: I still like Docker Swarm (projects like CapRover might be nice to look at as well), though many might enjoy the likes of K3s or K0s more (lightweight Kubernetes clusters).

Docker Swarm: https://docs.docker.com/engine/swarm/ (uses the Compose spec for manifests)

K3s: https://k3s.io/

K0s: https://k0sproject.io/ though MicroK8s and others are also okay.

I also like having something like Portainer as a GUI to manage the clusters: https://www.portainer.io/ For Kubernetes, Rancher might offer more features, but will have a larger footprint.

Portainer even supports webhooks, so I can do a POST request at the end of a CI run and the cluster will automatically pull and launch the latest tagged version of my apps: https://docs.portainer.io/user/docker/services/webhooks
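A minimal sketch of that final CI step in Python, using only the standard library. The webhook URL is hypothetical (the real one is copied from the service page in Portainer), and the optional `tag` query parameter is an assumption you should check against the linked docs before relying on it.

```python
import urllib.parse
import urllib.request


def build_webhook_request(webhook_url, tag=None):
    """Build (but do not send) the POST request that triggers a redeploy."""
    url = webhook_url
    if tag is not None:
        # Assumption: the webhook accepts a "tag" query parameter
        # to select which image tag gets deployed.
        url = f"{webhook_url}?{urllib.parse.urlencode({'tag': tag})}"
    return urllib.request.Request(url, data=b"", method="POST")


def trigger_redeploy(webhook_url, tag=None, timeout=10):
    """Send the request and return the HTTP status code."""
    request = build_webhook_request(webhook_url, tag)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Hypothetical URL, only printed here, never contacted.
    demo = build_webhook_request(
        "https://portainer.example.com/api/webhooks/your-webhook-id", tag="1.2.3"
    )
    print(demo.get_method(), demo.full_url)
```

In a pipeline you would call `trigger_redeploy(...)` as the last step, with the URL injected as a CI secret rather than committed to the repo.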

> keep software up to date? ex: Postgres, OS

I build my own base container images and rebuild them (with recent package versions) on a regular basis, which is automatically scheduled: https://blog.kronis.dev/articles/using-ubuntu-as-the-base-fo...

Drone CI makes this easy to have happen in the background, as long as I don't update across major versions, or Maven decides to release a new version and remove their old version .tar.gz archives from the downloads site for some reason, breaking my builds and making me update the URL: https://docs.drone.io/cron/

Some images, like databases, I just proxy through my Nexus instance. Version upgrades are relatively painless most of the time, at least as long as I've set up the persistent data directories correctly.

> do load balancing? built-in load balancer?

This is a bit more tricky. I use Apache2 with mod_md to get Let's Encrypt certificates and Docker Swarm networking for directing the incoming traffic across the services: https://blog.kronis.dev/tutorials/how-and-why-to-use-apache-...

Some might prefer Caddy, which is another great web server with automatic HTTPS: https://caddyserver.com/ but the Apache modules do pretty much everything I need and the performance has never actually been too bad for my needs. Up until now, applications themselves have always been the bottleneck, actually working on a blog post about comparing some web servers in real world circumstances.

However, making things a bit more failure resilient might involve just paying Hetzner (in this case) to give you a load balancer: https://www.hetzner.com/cloud/load-balancer which will make everything less painful once you need to scale.

Why? Because doing round robin DNS with the ACME certificate directory accessible and synchronized across multiple servers is a nuisance, although servers like Caddy attempt to get this working: https://caddyserver.com/docs/automatic-https#storage You could also get DNS-01 challenges working, but that needs even more work and integration with setting up TXT records. Even if you have multiple servers for resiliency, not all clients would try all of the IP addresses if one of the servers is down, although browsers should: https://webmasters.stackexchange.com/a/12704

So if you care about HTTPS certificates and want to do it yourself with multiple servers having the same hostname, you'll either need to get DNS-01 working, do some messing around with shared directories (which may or may not actually work), or will just need to get a regular commercial cert that you'd manually propagate to all of the web servers.
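To make the round robin DNS caveat concrete, here is a rough Python sketch of what a well-behaved client has to do: resolve every address behind the hostname and fall through to the next one when a server is down. The function names are made up for illustration, and the `connect` parameter exists only so the logic can be exercised without a network.

```python
import socket


def resolve_all(hostname, port):
    """Return every distinct IP address published for hostname."""
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]
        if ip not in addresses:
            addresses.append(ip)
    return addresses


def connect_first_available(addresses, port, connect=socket.create_connection, timeout=3):
    """Try each address in order; return (ip, connection) for the first that works."""
    last_error = None
    for ip in addresses:
        try:
            return ip, connect((ip, port), timeout)
        except OSError as error:
            # This server is down or unreachable, move on to the next record.
            last_error = error
    raise ConnectionError(f"no address reachable: {last_error}")
```

Python's own `socket.create_connection` already loops over all resolved addresses when given a hostname, but plenty of clients stop at the first record, which is why a real load balancer ends up being the less fragile option.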

From there on out it should be a regular reverse proxy setup, in my case Docker Swarm takes care of the service discovery (hostnames that I can access).

> handle scaling? Terraform?

None, I manually provision how many nodes I need, mostly because I'm too broke to hand over my wallet to automation.

They have an API that you or someone else could probably hook up: https://docs.hetzner.cloud/
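If someone did want to hook that up, a starting point might look like the sketch below. It only builds the "create server" request against the Hetzner Cloud API with a bearer token; the server name, server type, image and location in the example are placeholders, so check the API docs for the values currently on offer.

```python
import json
import urllib.request

API_BASE = "https://api.hetzner.cloud/v1"


def build_create_server_request(token, name, server_type, image, location=None):
    """Build the POST /servers request; sending it is left to the caller."""
    payload = {"name": name, "server_type": server_type, "image": image}
    if location:
        payload["location"] = location
    return urllib.request.Request(
        f"{API_BASE}/servers",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_server(token, **kwargs):
    """Send the request and return the decoded JSON response."""
    request = build_create_server_request(token, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

A call such as `create_server(token, name="app-3", server_type="cx11", image="ubuntu-22.04")` would be the manual "rent one more node" step turned into a script; an autoscaler would wrap it in whatever load metric you trust.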

> automate backups? ex: databases, storage. Do you use provided backups and snapshots?

I use bind mounts for all of my containers for persistent storage, so the data is accessible on the host directly.

Then I use something like BackupPC to connect to those servers (SSH/rsync) and pull data to my own backup node, which then compresses and deduplicates the data: https://backuppc.github.io/backuppc/

It was a pain to set up, but it works really well and has saved my hide dozens of times. Some might enjoy Bacula more: https://www.bacula.org/
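This is not how BackupPC works internally, but the pull model itself can be sketched with plain rsync over SSH. The hostnames, paths and snapshot naming below are made up; `--link-dest` is the rsync option that hard-links unchanged files against the previous snapshot, which gives a crude form of deduplication.

```python
import subprocess
from pathlib import PurePosixPath


def build_rsync_command(host, remote_dir, backup_root, snapshot_name, previous_snapshot=None):
    """Build an rsync command that pulls remote_dir from host into a dated snapshot."""
    destination = PurePosixPath(backup_root) / host / snapshot_name
    command = ["rsync", "--archive", "--delete", "-e", "ssh"]
    if previous_snapshot is not None:
        # Unchanged files become hard links into the previous snapshot.
        link_dest = PurePosixPath(backup_root) / host / previous_snapshot
        command.append(f"--link-dest={link_dest}")
    # Trailing slashes: copy the contents of remote_dir into the snapshot directory.
    command += [f"{host}:{remote_dir.rstrip('/')}/", f"{destination}/"]
    return command


def run_backup(*args, **kwargs):
    """Run the pull on the backup node; raises if rsync exits non-zero."""
    subprocess.run(build_rsync_command(*args, **kwargs), check=True)
```

Because the bind-mounted data lives in ordinary host directories, the backup node only needs SSH access to each server, and nothing backup-related has to run inside the containers.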

> maintain security? built-in firewall and DDoS protection?

I personally use Apache2 with ModSecurity and the OWASP ruleset, to act as a lightweight WAF: https://owasp.org/www-project-modsecurity-core-rule-set/

You might want to just cave in and go with Cloudflare for the most part, though: https://www.cloudflare.com/waf/


> Some might prefer Caddy, which is another great web server with automatic HTTPS: https://caddyserver.com/ but the Apache modules do pretty much everything I need and the performance has never actually been too bad for my needs. Up until now, applications themselves have always been the bottleneck, actually working on a blog post about comparing some web servers in real world circumstances.

For some reason Apache gets a bad rap for being old and slow, while in reality it's still pretty damn good at what it does. I worked at a hosting provider that used Apache on all of their servers, and I have never had any doubts that Apache is more than enough for all the things I might ever want to do with it. Sure, it doesn't serve up Markdown files as Caddy does, but as for performance, Apache itself has never been a bottleneck either. It's always the application or the database, never Apache.


> For some reason Apache gets a bad rap for being old and slow, while in reality it's still pretty damn good at what it does.

There are a few actual reasons why this might be the case, because in some configurations Apache can indeed be somewhat slow.

.htaccess: if you don't disable this mechanism, the web server might do a bit too much I/O. Their docs describe it all nicely, and it's worth paying attention to, because you won't always need .htaccess in the first place: https://httpd.apache.org/docs/current/howto/htaccess.html

mod_php: some don't bother setting up PHP-FPM properly (assuming that you want to run PHP apps with Apache) and instead reach for the legacy module, which has considerably worse performance than the alternative: https://cwiki.apache.org/confluence/display/HTTPD/PHP
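To put the .htaccess point in concrete terms: when overrides are allowed, Apache has to look for a .htaccess file in the requested directory and in every parent directory above it. The small Python sketch below just lists those lookups for a given directory; the function name is made up, and it models the lookup order only, not Apache's actual implementation.

```python
from pathlib import PurePosixPath


def htaccess_candidates(directory, filename=".htaccess"):
    """List every per-directory config file checked for a request served from directory."""
    path = PurePosixPath(directory)
    # parents runs from the nearest ancestor up to "/", so reverse it
    # to get the root-first order in which the files are applied.
    chain = list(path.parents)[::-1] + [path]
    return [str(d / filename) for d in chain]


if __name__ == "__main__":
    for candidate in htaccess_candidates("/www/htdocs/example"):
        print(candidate)
```

For `/www/htdocs/example` that is four extra file lookups per request, which is why setting `AllowOverride None` and moving the rules into the main config is the usual advice when you control the server.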

Once I get some more motivation, I'll do a real world comparison of Apache, Nginx and probably Caddy as well in a variety of workloads. My intuition tells me that Apache will still be slower, but not to a degree where it would be a non-starter for the majority of the projects out there.



