I put my original mainboard in one of these when I upgraded. It's fantastic. I had it VESA-mounted to the back of a monitor for a while which made a great desktop PC. Now I use it as an HTPC.
Those are pretty cool. I meant to highlight more that the laptop has done super well. I can't even tell it's on: I hear no fan and feel no heat. I guess laptops are pretty good for this, as they are great at sipping power under low load.
Back in 2012 or so, I reused an old netbook (an Asus Eee PC) with an Atom CPU & 1GB of RAM, installed Ubuntu Server, and used it as a home server. It handled the printer, DNS-VPN proxying for streaming, and a few other things admirably for years. (And ironically it was resilient to Spectre, because that generation of Atom was an in-order design without the speculative execution Spectre relies on.)
Eventually, the thing that kicked the bucket was actually the keyboard (and later the fan started making "my car won't start" noises occasionally). Even the horribly-slow HDD (that handled Ubuntu Server surprisingly well) hadn't died yet.
A fair few things want blob object storage like S3. NFS does not scale to ridiculous levels horizontally or vertically. S3 does things like de-duplication and other funky tricks.
So if you want to use an app that needs S3 then you need to deploy S3 and not NFS.
I run a minio cluster (S3) for Veeam backups at work. I also run multiple NFS for Veeam and VMware datastores.
I'd rather go with an old Dell T30 and 2x10TB Seagate Exos in ZFS RAID1 mode (Mirror). This thing would make me nervous every day, even with a daily backup in place... While the Dell T30 would also make me nervous, you could at least plug the disks into any other device and are not wiring up everything with some easy to pull out cables ;)
However, garage sounds nice :-) Thanks for posting.
I've been using ZFS for quite a while, and I had a realization some time ago that for a lot of data, I could tolerate a few hours worth of loss.
So instead of a mirror, I've set up two separate one-disk pools, with automatic snapshots of the primary pool every few hours, which are then zfs send/recv to the other pool.
This gives me a lot more flexibility in terms of the disks involved (one could be an SSD and the other spinning rust, for example), at the cost of some read speed and potential uptime.
Depending on your needs, you could even have the other disk external, and only connect it every few days.
I also have another mirrored RAID pool for more precious data. However, almost all articles on ZFS focus on the RAID aspect, while few talk about the less hardware-demanding setup described above.
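For anyone who wants to try the two-pool approach described above, here is a minimal sketch of one replication round. The pool names (tank, backup) and the snapshot names are hypothetical, and the function only prints the commands so they can be reviewed before running anything; it is an illustration under those assumptions, not the commenter's actual script.

```shell
# Print the commands for one replication round; nothing is executed here.
#   $1 = source pool          $2 = destination pool
#   $3 = previous snapshot name (must already exist on both pools)
#   $4 = new snapshot name
replication_plan() {
    src=$1 dst=$2 prev=$3 new=$4
    # Recursive snapshot of every dataset in the source pool.
    echo "zfs snapshot -r ${src}@${new}"
    # Incremental replication stream covering prev..new.
    # recv: -F rolls the target back to its latest snapshot first,
    #       -d keeps dataset names relative to the destination pool,
    #       -u leaves the received datasets unmounted.
    echo "zfs send -R -I ${src}@${prev} ${src}@${new} | zfs recv -Fdu ${dst}"
}

# Example with made-up names: review the output, then pipe it to sh if it looks right.
replication_plan tank backup repl-0600 repl-1200
```

The very first round needs a full send (no -I) because the destination has no common snapshot yet. Printing instead of executing is a deliberate choice here, since -F on the receiving side discards changes made on the target.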
1.) A mirror with an attached Tasmota power plug that I can turn on and off via curl to spin up a USB backup HD:
curl "$TASMOTA_HOST/cm?cmnd=POWER+ON"
# preparation and pool imports
# ...
# clone the active pool onto usb pool
zfs send --raw -RI "$BACKUP_FROM_SNAPSHOT" "$BACKUP_UNTIL_SNAPSHOT" | pv | zfs recv -Fdu "$DST_POOL"
To prevent partial data loss I use zfs-auto-snapshot, zrepl or sanoid, which I configure to snapshot every 15 minutes and keep daily, weekly, monthly and yearly snapshots as long as possible.
To clean up my space when having too many snapshots, I wrote my own zfs-tool (https://github.com/sandreas/zfs-tool), where you can do something like this:
That's a really cool idea and matches my use case well. I just copy pasted it to another person in this thread who was asking about the ZFS setup.
Your use case perfectly matches mine in that I wouldn't mind much about a few hours of data loss.
I guess the one issue is that it would require more disks, which at current prices is not cheap. I was surprised how expensive they were when I bought them 6 months ago, and was even more surprised when I looked recently and saw that the same drives cost even more now.
That sounds cool; is it possible to just query the ZFS system to know when it has finished synchronizing the slow disk, before bringing it offline again? Do you think that stopping and spinning the disk again, 24 times a day, is not going to cause much wear to the motors?
I’ve not heard of garage before but it looks quite interesting. I use s3 a lot for work but for homelab backups I’ve always just used borg on borgbase. Now I’m wondering whether I could use garage to pair a local node and AWS glacier for cheap redundancy of a large media library (I’m assuming that ~all of the reading is automatically done from the local node). TFA doesn’t really talk much about the actual experience of using garage - would love to hear more opinions from those who use it for self-hosting.
Edit: Realised you can’t use glacier since storage has to be mounted to the ec2 compute running the garage binary as a filesystem. So doesn’t really make sense as media library backup over just scheduling a periodic borg / restic backup to glacier directly.
I haven't needed to interact with Garage itself specifically. I've been using Boto3 / awscli / s3cmd / rclone for everything S3 API related and it's worked great. Garage took a few commands to set up, turn on, and get API keys configured, and it has then been left to run on its own for the past 4 months.
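As a rough illustration of what "using awscli against Garage" can look like: the only difference from talking to AWS is pointing the client at the local endpoint. The endpoint http://localhost:3900, the region name garage, and the bucket name my-bucket are assumptions for this sketch (check the values in your own garage.toml); credentials are expected in the usual AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY environment variables. The helper below only prints the command it would run.

```shell
# Assumed values; override them through the environment to match your setup.
GARAGE_ENDPOINT=${GARAGE_ENDPOINT:-http://localhost:3900}
GARAGE_REGION=${GARAGE_REGION:-garage}

# Hypothetical helper: prefixes any awscli subcommand with the endpoint and
# region options. It echoes the command; drop the "echo" to actually run it.
garage_aws() {
    echo aws --endpoint-url "$GARAGE_ENDPOINT" --region "$GARAGE_REGION" "$@"
}

garage_aws s3 ls
garage_aws s3 cp ./photo.jpg s3://my-bucket/photo.jpg
```

The same idea applies to the other clients mentioned: each has its own setting for a custom S3 endpoint.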
Nice to see Garage mentioned. I was deciding between S3-compatible self-hosted alternatives and ended up choosing SeaweedFS. It seems to require less manual configuration compared to Garage
I'd like more elaboration on the technical side. Not literally how to do the same and what commands to use, but more along the lines of how the ZFS pools are configured, or whether Garage is opinionated and configures it all by itself. Are there mirrors in there? Or is it just individual pools that sync from some disks to others?
I have 2 USB disks and want to make a cheapo NAS, but I keep going back and forth between making a ZFS mirror, making 2 independent pools and using one to back up the other, or going the alternate route and using SnapRAID so I can mix in more older HDDs and make maximum use of the hardware I already own.
My understanding is that Garage is not opinionated and could easily have worked without ZFS. I installed ZFS in Ubuntu, and then later installed Garage.
As for the ZFS setup, I kept it simple and did RAID5/raidz1. I'm no expert in that, and have been starting to think about it again as the pool approaches 33% full.
I saw this comment in another thread here that sounded interesting as well by magicalhippo:
"I've been using ZFS for quite a while, and I had a realization some time ago that for a lot of data, I could tolerate a few hours worth of loss. So instead of a mirror, I've set up two separate one-disk pools, with automatic snapshots of the primary pool every few hours, which are then zfs send/recv to the other pool."
This caught my attention as it matches my use case well. My original idea was that RAID5 would be good in case an HD fails, and that I would replicate the setup at another location, but the overall costs (~$1k USD) are enough that I haven't done that yet.
If you know where to look or are a little lucky, you can get an adequate RAID5 going for around $500-800, depending on the storage you need. I grabbed a QNAP 4-bay (no SSD caching) and 4x refurbished enterprise HDDs (14TB each) for just under $700 all-in last November, if memory serves. Pretty reasonable for a 42TB RAID5 IMO.
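The 42TB figure follows from single-parity RAID giving up one disk's worth of capacity to parity. A small sketch of that arithmetic (the function name is made up, and filesystem overhead and TB/TiB differences are ignored):

```shell
# Usable capacity of a single-parity array (RAID5 / raidz1):
# one disk's worth of space goes to parity.
#   $1 = number of disks   $2 = size of each disk in TB
raid5_usable_tb() {
    disks=$1 size_tb=$2
    echo $(( (disks - 1) * size_tb ))
}

raid5_usable_tb 4 14   # 4 x 14TB drives -> prints 42
```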
It's weird to me that "owning a computer that runs stuff" is now "self-hosting", just feels like an odd phrasing. Like there's an assumption that all computers belong to someone else now, so we have to specify that we're using our own.
It’s not clear from the blog post if the S3 is accessible from outside their home. I agree with the parent that purely local services aren’t what typically counts as “self-hosting”.
Let's not kid ourselves that maintaining 10TB with resiliency handling and other controls built in is something that is trivial. It is only trivial due to the offerings that Cloud computing has made easy.
Self-hosting implies those features without the cloud element and not just buying a computer.
10TB fits on one disk though. It may not be trivial, but setting up RAID 1 is not overly complicated. Off-site redundancy and backup of course do make it more complicated.
You can buy a 10TB+ external drive which uses RAID1.
You can also buy a computer with this — not a laptop, and I don't know about budget desktops, but on Dell's site (for example) it's just a drop-down selection box.
Very cool! I replaced my mainboard on my framework and am trying to convert it to a backup for my nas.
Could you talk a little more about your ZFS setup? I literally just want it to be a place to send snapshots, but I’m worried about the USB connection speed and about accidentally unplugging it and losing data.
ZFS is RAM hungry, plus it doesn't like USB connections (as the article implied). So I've been eyeing btrfs as a way to set up my NAS drives. Would I miss something in that setup?
Getting into S3 myself and really curious about what Garage has to offer vs the more mature alternatives like Minio. From what I gather, it kinda works better with small (a few kilobytes) files or something?
I loved minio until they silently removed 99% of the admin UI to push users towards the paid offering. It just disappeared one day after fetching the new minio images. The only evidence of the change online was discussions by confused users in the GitHub issues
I have also been considering this for some time. Been comparing MinIO, Garage, and Ceph. MinIO may not be wise given their recent moves, as another commenter noted. Garage seems ok, but their git doesn’t show much activity these days, so I wonder if it too will be abandoned. Which leaves us with Ceph. It may have a steeper learning curve, but it also offers the most flexibility, as one can do object as well as block and file. Gonna set up a single node with 9 OSDs soon and give it a go, but always looking for input if anyone would like to provide some.
If I can reassure you about Garage, it's not at all abandoned. We have active work going on to make a GUI for cluster administration, and we have applied for a new round of funding for more low-level work on performance, which should keep us going for the next year or so. Expect some more activity in the near future.
I manage several Garage clusters and will keep maintaining the software to keep these clusters running. But concerning the "low level of activity in the git repo": we originally built Garage for some specific needs, and it fits these needs quite well in its current form. So I'd argue that "low activity" doesn't mean it's not reliable, in fact it's the contrary: low activity means that it works well for us and there isn't a need to change anything.
Of course implementing new features is another deal, I personally have only limited time to spend on implementing features that I don't need myself. But we would always welcome outside contributions of new features from people with specific needs.
I appreciate the response! Thanks for the update. I will continue keeping an eye on the project then and possibly give it a try. I have read the docs and was considering setting it up across two sites. The implementation seemed to address the latency pain point that distributed storage solutions usually have.
I've used Ceph in a home lab setting for 9 years or so now. Since cephadm it has gotten even easier to manage, even though it was never really that hard. A few pointers. First, no SMR drives: their performance is so bad that they can periodically drop out of the cluster. Second, no consumer SSDs/NVMe devices. You need power loss protection (PLP) on your drives. Ceph writes directly to the drive and bypasses the cache, so without PLP you may literally get slower performance than rust.
You also want fast networking; I just use 10Gbps. I have 5 nodes, each with 6 rust drives and 1 NVMe drive. I colocate my MON and MDS daemons with my OSDs; each node has 64GB of RAM and I use around 40GB.
Usage is RBD for a three-node OpenStack cluster, and CephFS. I have about 424TiB raw between rust and NVMe.
I have an ancient QNAP NAS (2015) which is on borrowed time and I’m trying to figure out what to replace it with. Keep going back and forth between rolling my own with a Jonsbo case vs. a prebuilt like the new Ubiquiti boxes. This is an attractive third option of a modest compute box (raspy, NUC, etc.) paired with a JBOD over USB. Can you still use something like TrueNAS with a setup like that?
Neat. Depending on your use case it might make sense.
Still I wonder what they use for backup? For many use cases downtime is acceptable, but data loss is generally not. Did I miss it in the post?
OP here. I currently have some things synced to a cloud S3. The long-term plan would be to replicate the setup at another location to take advantage of Garage regions/nodes, but that needs to wait for the money.
Yeah, this was an effort to get around cloud costs for large amounts of 'low value' data that I have but use in my other home servers for processing. I still sync some smaller result sets to an S3 in the cloud for redundancy as well as for CDN uses.
Why are you calling it S3? That is a proprietary Amazon cloud technology. Why not call it what it is, e.g. ZFS, file store, or object store? Let's not dilute terms.
That's a good point, it is S3-compatible object storage, not just S3. My experience with AWS S3 has shaped the way I use object storage, and since this project is synced to another S3-compatible object store using the S3 protocol, in my head I just call it all S3.
Okay, weird to call it S3 if it is just object storage somewhere else. It's like saying "EKS" if you mean Kubernetes, or talking about "self hosting EC2" by installing qemu.
AWS S3 was the original provider of the S3 API; nowadays most cloud providers and a bunch of self-hosted software support S3(-compatible) APIs. Call it Object Store (which is a bit unspecific) or call it S3-compatible.
EKS and EC2 on the other hand are a set of tools and services, operated by AWS for you - with some APIs surrounding them that are not replicated by any other party (at least for production use).
https://www.coolermaster.com/en-global/products/framework/