
71 TiB DIY NAS Based on ZFS on Linux - ors
http://louwrentius.com/74tb-diy-nas-based-on-zfs-on-linux.html
======
KaiserPro
I know I'm going to suffer hard for this comment. However I'd suggest reading
the ZFS optimisation guide.

I know the allure of RAID 7 is strong, however if a disk were to fail on a LUN
that's 24 disks wide, the chances of a successful rebuild are pretty low. (I
know you've since changed that.) The rebuild time is around 60-90 hours with no
load.

(We use 14-disk RAID 6 LUNs for the balance of performance vs safety; however,
we've come pretty close to losing it all.)

You'll get better performance if you have many smaller vdevs. You'll also
compartmentalize the risk of multiple disk failure.

A RAID 0+6 of four LUNs of 5 disks will have much greater performance and will
rebuild much faster, without risking too much. However, you will lose more
space.

~~~
rancor
Nope, you're right on the money. My first thought was "this guy needs more
VDEVs". Given the hardware and use case, 4x 6-disk RAIDZ2 VDEVs would make a
ton more sense, and could also be cross-cabled to tolerate a controller failure.
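
For illustration, a pool laid out that way can be created in one go; a minimal
sketch with placeholder device names (in practice you'd want /dev/disk/by-id
paths rather than sdX):

    # one pool, four RAIDZ2 vdevs of six disks each
    zpool create tank \
        raidz2 sda sdb sdc sdd sde sdf \
        raidz2 sdg sdh sdi sdj sdk sdl \
        raidz2 sdm sdn sdo sdp sdq sdr \
        raidz2 sds sdt sdu sdv sdw sdx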

~~~
storrgie
Smaller VDEVs make it a bit easier to increase size if you're rolling your own
system. Buying 6 more drives instead of 8 or 10 is a smaller cost.

------
kogir
ZFS is pretty great, but it's not perfect. Occasionally you have to dump and
re-create a whole pool to fix corruption:

    
    
        [nsivo@news.ycombinator.com ~]$ zpool status -v
          pool: arc
         state: ONLINE
          scan: scrub repaired 0 in 0h43m with 0 errors on Wed Aug 13 20:14:53 2014
        config:
        
            NAME          STATE     READ WRITE CKSUM
            arc           ONLINE       0     0     0
              mirror-0    ONLINE       0     0     0
                gpt/arc0  ONLINE       0     0     0
                gpt/arc1  ONLINE       0     0     0
        
        errors: No known data errors
        
          pool: tank
         state: ONLINE
        status: One or more devices has experienced an error resulting in data
            corruption.  Applications may be affected.
        action: Restore the file in question if possible.  Otherwise restore the
            entire pool from backup.
           see: http://illumos.org/msg/ZFS-8000-8A
          scan: scrub repaired 0 in 5h3m with 0 errors on Tue Aug 26 00:19:05 2014
        config:
        
            NAME           STATE     READ WRITE CKSUM
            tank           ONLINE       0     0     0
              raidz2-0     ONLINE       0     0     0
                gpt/tank0  ONLINE       0     0     0
                gpt/tank1  ONLINE       0     0     0
                gpt/tank2  ONLINE       0     0     0
                gpt/tank3  ONLINE       0     0     0
            logs
              mirror-1     ONLINE       0     0     0
                aacd5p1    ONLINE       0     0     0
                aacd6p1    ONLINE       0     0     0
        
        errors: Permanent errors have been detected in the following files:
        
                /usr/nginx/var/log/nginx-access-appengine.log.0
    

I can't delete that file or recover it from backup, and don't really trust the
pool or our disk controller anymore. So we're moving to a new server, again.
Good times.

I imagine this would be disastrous for someone who's filled a 24 disk array
with data, and has to purchase a duplicate to safely restore to.

~~~
dsturnbull2049
This is one of the key strengths of ZFS. Your log file would be _silently
corrupted_ without checksumming.

~~~
rodgerd
Regardless, it's still frustrating that so many failure modes for ZFS have an
answer of "blow it all away and start again". And that's with Solaris/ZFS.

~~~
laumars
There are other ways to fix issues with ZFS pools without having to " _blow it
all away and start again_ ". In the example that started this conversation
fork, there's a simple zpool subcommand that would have saved the OP from
dropping his pool:

    
    
        zpool clear tank
    

ZFS does actually come with a decent set of tools for fixing and clearing
faults - which I'm starting to realise isn't widely known (possibly because
people rarely run into a situation where they need them).
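
For anyone unfamiliar with that toolset, the usual fault-handling loop looks
roughly like this (pool and device names are just examples):

    zpool status -v tank          # see which device is faulted and why
    zpool clear tank              # clear transient errors once the cause is fixed
    zpool scrub tank              # re-verify every block against its checksum
    zpool replace tank sdc sdh    # swap a genuinely dead disk for a new one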

Anecdotally speaking, I've run the same pool for years on consumer hardware,
and that pool has even survived a failing SATA controller that was randomly
dropping disks during heavy IO and causing kernel panics. I've lost count of
the number of times ZFS has saved my storage pool from complete data loss
where other file systems would have just given up.

------
mrb
This guy would have more predictable performance and higher reliability if he
had organized the array as two 12-drive raidz2 vdevs (instead of 18-drive and
6-drive), because (1) the chance of a triple drive failure, though low, is
higher in an 18-drive vdev than in a 12-drive one, and (2) performance (IOPS)
is uneven across two asymmetrical vdevs.

There is very rarely a good reason for doing asymmetrical vdevs in a single
ZFS pool.

I approve of his choice of HBA (based on the SAS2008 chip). For those looking
for even bigger setups, I maintain a list of controllers at
[http://blog.zorinaq.com/?e=10](http://blog.zorinaq.com/?e=10) where you can
even find 16-, 24- and 32-port controllers. 8-port ones are the most cost
effective as of today.

I have been a happy ZFS user for 9 years, both personally and professionally:
using snapshots, running periodic scrubs, and dealing with a dozen drive
failures over the years. Everything works beautifully - it's a pleasure to
admin ZFS.

~~~
voltagex_
Any hints on building a ~16TB array? Drive size, RAM size?

I've been eyeing this -
[http://www.supermicro.com/products/motherboard/Atom/X10/A1SA...](http://www.supermicro.com/products/motherboard/Atom/X10/A1SAi-2750F.cfm),
but the project is firmly in the "one day when I have a spare $5k" basket.

------
chubot
Echoing one of the comments -- why Linux? I thought most open source ZFS
configurations used FreeBSD. (I know there are the open source Solaris
variants, but I would imagine most customers still pay for support.)

ZFS is production ready on Linux now?

~~~
snw
There is indeed a great and active open source ecosystem of Illumos (ex-
OpenSolaris) distributions.

Illumos is the de facto upstream of OpenZFS and has many features not found in
Oracle Solaris. [1]

So it is well worth checking out at least one of those:

\- SmartOS, which is a fantastic cloud hypervisor [2]

\- OmniOS, a classic server distribution, probably best for what OP is
building. [3]

\- NexentaStor, a storage appliance - but I have not tried it yet [4]

I can definitely recommend using ZFS on an Illumos system, as it has much
better integration with things like kstats and the fault management
architecture, which I would miss on any other OS.

[1] [http://open-zfs.org/wiki/Features](http://open-zfs.org/wiki/Features)

[2] [http://smartos.org](http://smartos.org)

[3] [http://omnios.omniti.com](http://omnios.omniti.com)

[4] [http://www.nexentastor.org](http://www.nexentastor.org)

~~~
_delirium
I notice you didn't mention OpenIndiana. Is OmniOS now recommended over
OpenIndiana for the classic-server-install case?

~~~
radiowave
snw's post might be reflecting the fact that OpenIndiana isn't seeing a lot of
updates (except perhaps for those that come directly from Illumos). For
example, when the Heartbleed bug came to light, it turned out that OpenIndiana
wasn't vulnerable, because the version of OpenSSL it shipped was so old it had
never had the vulnerability introduced into it in the first place.

As yet I don't have any hands-on experience with OmniOS, but I'd tend to agree
that it seems the best direction to look for a conventional OpenSolaris-derived
OS.

~~~
snw
This is indeed the biggest reason. They have a "hipster" branch [1] that is
getting some updates, but I don't follow that closely. Another reason is that
they try to target the "developer desktop", which is not that interesting to me
personally and is also probably a lot more work (driver-wise) than they have
the manpower for...

OmniOS on the other hand is pretty solid, and with pkgsrc (the default package
system on SmartOS and NetBSD) [2] you get nearly 13k packages of open-source
software.

My list is also missing a few other commercial players that use and contribute
to Illumos:

\- Delphix, which uses ZFS snapshots to do fancy things with big databases

\- Pluribus, whose "Netvisor OS" is an SDN router/networking appliance [3]

And then there is the long tail of small hobby distributions:

\- Tribblix, "retro with modern features" [4]

\- DilOS, which has a focus on Xen and SPARC [5]

and probably more that I've forgotten or don't know about.

It is a very nice community and lots of interesting technology gets developed.
Our company migrated nearly all servers from Linux to SmartOS in the last 2
years and we could not be happier.

[1]
[http://wiki.openindiana.org/oi/Hipster](http://wiki.openindiana.org/oi/Hipster)

[2] [http://pkgsrc.joyent.com/installing.html#installing-on-
illum...](http://pkgsrc.joyent.com/installing.html#installing-on-illumos)

[3] [http://www.pluribusnetworks.com/products/netvisor-
os/](http://www.pluribusnetworks.com/products/netvisor-os/)

[4] [http://www.tribblix.org](http://www.tribblix.org)

[5] [http://www.dilos.org/about-dilos](http://www.dilos.org/about-dilos)

~~~
_delirium
Thanks, that's helpful! I'm mainly experienced with Debian, but I've been
exploring whether Debian-for-everything is still the best policy. In terms of
actually trying anything else, I've only been doing some test setups of
FreeBSD so far, but "something Illumos" has been on my radar as well and I've
been trying to understand the landscape there, along with playing a little bit
(not very seriously, I'll admit) on a SmartOS instance through Joyent's cloud.

Technically it seems quite impressive, and Illumos being more or less the ZFS
upstream is appealing. The distribution situation has been more confusing,
though. OpenIndiana seemed to have a community, which is why I was first
looking into that option, but it does indeed not seem to be very active.
That's one attraction of FreeBSD & Debian, that they're community supported,
and I can with fairly high confidence expect they'll be around in 5 or 10
years supporting the distribution. It's less clear to me how to wade into
Illumos in a way that mitigates risk of the upstream going away. I think
Illumos itself fits that description, with a community that will be around,
but does any individual distribution? There are clearly resources behind
OmniOS and SmartOS, which gives some confidence, but they are also quite
heavily concentrated resources (one company drives each, and it's not
impossible that they could change focus and deemphasize development of the
public distribution).

~~~
snw
That's a very valid concern. For SmartOS it would be a major blow if Joyent
were to pull all resources. They have some of the brightest engineers I know
working on it, and I have learned a lot from them just by using the system and
hanging out on the IRC channel on Freenode. Without them the speed of
development would definitely take a hit.

On the other hand, they have been - and still are - very good at building a
community around it. There are multiple people and organisations building
their own SmartOS images/derivatives ([1], [2]). Pull requests on GitHub come
in from a diverse enough group that I think it would survive even without
Joyent.

For OmniOS I don't have enough insight to make that judgement. They have a
very clear and detailed release plan and recently hired some good people just
to work on OmniOS. These things definitely add some confidence, but of course
are no guarantee forever. And if it ever came to that, one can always buy
support from them.

[1] [http://www.dogeos.net](http://www.dogeos.net)

[2] [http://imgapi.uqcloud.net/builds](http://imgapi.uqcloud.net/builds)

------
smilliken
We've been running ZFS on Linux in production at MixRank for 6 months. We push
it hard and haven't had any issues yet. We're using arrays of 18x Intel 530
240GB SSDs in raidz2. It's performing very well: 697k read, 260k write IOPS;
3768 MB/s read, 1614 MB/s write throughput.

I recommend it, but only after you've done enough research to feel comfortable
with all of the ways to configure it and to understand the internals at a high
level. For a personal setup, though, you can just install it and forget it.

------
nrzuk
Nice build. I have something similar but with 3TB disks. My disks are grouped
into 8-disk VDEVs, because at the thought of trying to recover the whole array
if it failed, I'd probably cry a little.

Also, I invested in 10GbE. At first it was only server to server, using some
SFP+ NICs and DAC leads I picked up off eBay for between £10 and £80. But I
have since bought a couple of Dell 5524 switches and a PCIe enclosure for my
Mac, and having 650MB/s read/write to my server from the desktop certainly
reduces the waiting around :)

------
gonzo
Used Dell C2100: 2x Xeon E5620 (quad core), 96GB RAM, Chelsio 10GbE.

Two 2.5" 160GB internal SSDs for ZIL and L2ARC.

12x 4TB drives.

$2000 + Chelsio NIC ($230) + 12x 4TB WD Red ($2000).

So for $4250 I get 48TB and a better performing system in 2U.

His system costs 6012 EUR, or about 7900 USD at today's exchange rate. For
$8500 I could fit 96TB in the same space, have 192GB of RAM, 16 cores @
2.4GHz, and 740GB of SSD for ZIL and L2ARC.

And redundant power.

~~~
ZenoArrow
You're not comparing correctly. Your system would only have 96TB if you
weren't using any form of RAID-like caching. The guy from the article is using
ZFS caching to improve data reliability.

~~~
gonzo
He has 24x 4TB drives in 4U, but only 16GB of RAM, a single E3-1230 (quad
core) at 3.30GHz, and 4x 1Gbps networking.

I'm not sure what a "RAID-like cache" is. I know what redundancy is, and I
know how ZFS works.

16GB is likely adequate for his storage needs. He doesn't have that many
clients accessing the storage.

And I could keep two complete copies of the data on two machines, each with
redundant power, etc.

------
simplexion
Reading this just makes me think of this
[http://www.smbitjournal.com/2014/05/the-cult-of-
zfs/](http://www.smbitjournal.com/2014/05/the-cult-of-zfs/)

~~~
kev009
This article is quite ignorant.

ZFS uses integrated software RAID (in the zpool layer) for technical reasons,
not merely to "shift administration into the zfs command set". Resilvering,
checksumming, and scrubs are a unified concept and are _file_ aware, not
merely _block_ aware like nearly every other RAID. The implications of this
are massive, and if you don't understand them, please don't write an article
on file systems.

For various reasons, the snapshots and volume management are more usable
thanks to the integrated design and CoW, which also serve as pillars for ZFS
send/receive.

The "write hole" piece is bullshit. ZFS is an atomic file system. It has no
vulnerability to a write hole.

The "file system check" piece is bullshit. Again, ZFS is an atomic file
system. The ZIL is played back on a crash to catch up sync writes. A scrub is
not necessary after a hard crash.

Quite frankly, for any modern RAID you probably should be using ZFS unless you
are a skilled systems engineer balancing some kind of trade-off (stacking a
higher-level FS like Gluster/Ceph on top, fail-in-place, object storage, etc).
You should even use ZFS on single-drive systems for the checksumming and CoW,
and for the new possibilities in system management, such as boot environments
that let you roll back failed upgrades.
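
On illumos or FreeBSD with beadm, for example, that rollback workflow is
roughly the following (a sketch, assuming a ZFS root and beadm installed):

    beadm create pre-upgrade      # clone the current boot environment
    # ... perform the risky OS/package upgrade ...
    # if it goes badly, boot back into the old environment:
    beadm activate pre-upgrade
    reboot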

Hardware RAID controller quality isn't spectacular, and the author clearly has
not looked at the drivers to be dishing out such bad advice. You want FOSS
kernel programmers handling as much of your data integrity as possible, not
corporate firmware and driver developers who cycle out entirely every 2 years
(LSI/Avago). And there's effectively one vendor left, LSI/Avago, making the
RAID controllers used in the enterprise.

ZFS is production grade on Linux. Btrfs will be ready in 2 years, said
everyone since its inception and every 2 years thereafter. It's a pretty risky
option right now, but when it's ready it will deliver the same features the
author bizarrely tries to dismiss in his article. ZFS is the best and safest
route for RAID storage in 2014 and will remain so for at least "2 years".

------
XERQ
> My zpool is now the appropriate number of disks (2^n + parity) in the VDEVs.
> So I have one 18 disk RAIDZ2 VDEV (2^4+2) and one 6 disk RAIDZ2 VDEV (2^2+2)
> for a total of twenty-four drives.

That's a catastrophe waiting to happen. EMC recommends no more than 8 disks in
a parity array group (ZFS recommends no more than 9), specifically because
larger arrays mean longer rebuilds, which are more likely to trigger a URE, in
which case you're SOL.
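
A rough back-of-envelope, assuming the commonly quoted consumer-drive spec of
one unrecoverable read error per 10^14 bits: resilvering one disk in an
18-wide vdev of 4TB drives means reading the 17 survivors.

    17 drives x 4 TB = 68 TB ≈ 5.4 x 10^14 bits read
    expected UREs    ≈ 5.4 x 10^14 / 10^14 ≈ 5

Real drives typically beat that spec by a wide margin, and with one disk
failed a RAIDZ2 vdev still has one parity left to correct a stray URE, but it
illustrates why narrower vdevs shorten the window of risk.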

~~~
darklajid
Can you expand the acronyms? I'm having the most trouble with URE.

~~~
dsr_
unrecoverable read error.

------
bruce_one
Has anyone tried HAMMER (a DragonFly BSD creation) and have any thoughts on it
compared to ZFS?

~~~
_delirium
HAMMER is currently in the middle of a major rewrite to HAMMER2. That's
supposed to support multi-master cluster usage, in addition to ZFS/btrfs-like
features, so is quite ambitious. The new version is the main focus of full-
time development, but not yet considered production-ready. I think the old
version is stable-ish but was considered a dead-end by the main developer, so
it doesn't have much deployment or development. I would consider the project
research-stage at this point, though it's quite interesting research.

------
Theodores
How many Libraries of Congress is that?

Isn't there still a danger of typing something silly, an 'rsync --delete' type
of command that bricks the whole thing? Rather than the one Borg Cube I think
I would prefer to add disks/machines with disks on an as required basis, or
keep the last system in the loft 'just in case'.

I am still amazed though at how much disk people 'need'. Even in uncompressed
4K resolution it would take a long time to watch 71TiB of video.

~~~
Erwin
He does not mention it, but ZFS has great support for snapshots (compared to
e.g. LVM snapshots). You don't have to carve out disk space in advance for
storing snapshots, and they are relatively high performance, so you can keep
an hourly, daily and weekly rotating set of snapshots (there are some scripts
around that handle that retention). As the snapshots take up space, your
available disk space just diminishes. And you can set it up so that while you
are in dir "foo", "foo/.zfs/snapshot/16:00/" has the contents of that
directory as of the hourly snapshot.

Another cool feature: you can send the diff between 2 snapshots as a file
stream (over e.g. SSH) and replay it on another machine. That's like using
rsync, except it's based on a stable snapshot (rsync can't fully handle files
that change while it's copying them) and you don't have to re-checksum every
part of every changed file. So you can use that for DR.
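
In practice that replication is a one-liner; something like the following
(dataset and host names are made up for illustration):

    zfs snapshot tank/data@2014-08-31
    zfs send -i tank/data@2014-08-30 tank/data@2014-08-31 | \
        ssh backuphost zfs receive -F backup/data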

Transparent compression is also great (as is deduping -- if you have tons of
spare memory!)

BTRFS will be able to do the same thing... one day.

~~~
chrismonsanto
> BTRFS will be able to do the same thing... one day.

It already can. btrfs sub snap does snapshots (and they work the same way),
btrfs send/receive does the diffs, and it supports lzo/zlib compression and
deduplication.
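
The Btrfs equivalents look roughly like this (paths are illustrative; send
requires read-only snapshots, hence the -r):

    btrfs subvolume snapshot -r /data /data/.snapshots/2014-08-31
    btrfs send -p /data/.snapshots/2014-08-30 /data/.snapshots/2014-08-31 | \
        ssh backuphost btrfs receive /backup/snapshots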

~~~
laumars
The last time I tried Btrfs (about a year ago, on ArchLinux) I found Btrfs
snapshots worked very differently from ZFS's. At least superficially, since
most of the guides I could find advocated using _rsync_ to restore snapshots -
even now I find it hard to believe there isn't a cleaner solution.

Aside from all of the positive things already mentioned about ZFS, one of the
biggest selling points for me is just how easy it is to administrate. The
_zfs_ and _zpool_ command line utilities are child's play to use, whereas
_btrfs_ felt a little less intuitive (to me at least).

I really wanted to like Btrfs, as it would have been nice to have a native
Linux solution that works on root partitions (my theory being that Btrfs
snapshots could provide restore points before running system updates on
ArchLinux), but sadly it proved to be more hassle than benefit for me.

------
storrgie
I have a similar system at home (almost exactly the same, except I don't have
the quad network adapter) using 2TB drives from Hitachi. I quite love being
able to save everything, and the peace of mind that ZFS brings.

As a plug, I've been using CrashPlanPro for offsite backup, and I've been very
happy with their service.

~~~
illumen
Do you also use linux?

~~~
storrgie
ZFS on Linux is a really good project with a wonderful community. I have
happily been using Linux and ZFS together for a long time.

------
grizzles
I'd like something like this, but not such a big box. Are there any open
source Drobo-like projects out there? Something like a Drobo, but hacker
friendly. Anyone know of a chassis like that?

~~~
rodgerd
Closest I can think of is the SilverStone DS380, which is a NAS-sized case
that gives you 8 hot-swap SATA bays, 4 2.5" internal mounts, and supports
mini-ITX motherboards.

------
rem0x4
... if you're going to use ZFS, you should really think about using FreeBSD.
OpenSolaris would be the obvious choice, but given Solaris's lack of support,
FreeBSD is second to none in ZFS support...

------
jspiros
I have a similar capacity solution running at home, though I have far more
vdevs (with fewer disks per vdev). I've been running it for a few years now,
so some things aren't as optimized as they could be, but it still works well
for my uses...

    
    
      # zpool list
      NAME      SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
      lanayru  73.3T  53.7T  19.6T    73%  1.00x  ONLINE  -
    
      # zpool status
        pool: lanayru
       state: ONLINE
      status: The pool is formatted using a legacy on-disk format.  The pool can
              still be used, but some features are unavailable.
      action: Upgrade the pool using 'zpool upgrade'.  Once this is done, the
              pool will no longer be accessible on software that does not support
              feature flags.
        scan: scrub repaired 1.58M in 201h47m with 0 errors on Fri Feb 21 13:37:24 2014
      config:
      
              NAME                                                  STATE     READ WRITE CKSUM
              lanayru                                               ONLINE       0     0     0
                raidz2-0                                            ONLINE       0     0     0
                  ata-SAMSUNG_HD204UI_______________-part1          ONLINE       0     0     0
                  ata-SAMSUNG_HD204UI_______________-part1          ONLINE       0     0     0
                  ata-SAMSUNG_HD204UI_______________-part1          ONLINE       0     0     0
                  ata-SAMSUNG_HD204UI_______________-part1          ONLINE       0     0     0
                  ata-SAMSUNG_HD204UI_______________-part1          ONLINE       0     0     0
                raidz2-1                                            ONLINE       0     0     0
                  ata-SAMSUNG_HD204UI_______________-part1          ONLINE       0     0     0
                  ata-SAMSUNG_HD204UI_______________-part1          ONLINE       0     0     0
                  ata-SAMSUNG_HD204UI_______________-part1          ONLINE       0     0     0
                  ata-SAMSUNG_HD204UI_______________-part1          ONLINE       0     0     0
                  ata-SAMSUNG_HD204UI_______________-part1          ONLINE       0     0     0
                raidz2-2                                            ONLINE       0     0     0
                  ata-ST2000DL004_HD204UI_______________-part1      ONLINE       0     0     0
                  ata-ST2000DL004_HD204UI_______________-part1      ONLINE       0     0     0
                  ata-ST2000DL004_HD204UI_______________-part1      ONLINE       0     0     0
                  ata-ST2000DL004_HD204UI_______________-part1      ONLINE       0     0     0
                  ata-ST2000DL004_HD204UI_______________-part1      ONLINE       0     0     0
                raidz2-3                                            ONLINE       0     0     0
                  ata-WDC_WD30EFRX-68AX9N0_WD-____________-part1    ONLINE       0     0     0
                  ata-WDC_WD30EFRX-68AX9N0_WD-____________-part1    ONLINE       0     0     0
                  ata-WDC_WD30EFRX-68AX9N0_WD-____________-part1    ONLINE       0     0     0
                  ata-WDC_WD30EFRX-68AX9N0_WD-____________-part1    ONLINE       0     0     0
                  ata-WDC_WD30EFRX-68AX9N0_WD-____________-part1    ONLINE       0     0     0
                raidz2-5                                            ONLINE       0     0     0
                  ata-WDC_WD30EFRX-68EUZN0_WD-____________-part1    ONLINE       0     0     0
                  ata-WDC_WD30EFRX-68EUZN0_WD-____________-part1    ONLINE       0     0     0
                  ata-WDC_WD30EFRX-68EUZN0_WD-____________-part1    ONLINE       0     0     0
                  ata-WDC_WD30EFRX-68EUZN0_WD-____________-part1    ONLINE       0     0     0
                  ata-WDC_WD30EFRX-68EUZN0_WD-____________-part1    ONLINE       0     0     0
                  ata-WDC_WD30EFRX-68EUZN0_WD-____________-part1    ONLINE       0     0     0
                raidz2-6                                            ONLINE       0     0     0
                  ata-WDC_WD30EFRX-68EUZN0_WD-____________-part1    ONLINE       0     0     0
                  ata-WDC_WD30EFRX-68EUZN0_WD-____________-part1    ONLINE       0     0     0
                  ata-WDC_WD30EFRX-68EUZN0_WD-____________-part1    ONLINE       0     0     0
                  ata-WDC_WD30EFRX-68EUZN0_WD-____________-part1    ONLINE       0     0     0
                  ata-WDC_WD30EFRX-68EUZN0_WD-____________-part1    ONLINE       0     0     0
                  ata-WDC_WD30EFRX-68EUZN0_WD-____________-part1    ONLINE       0     0     0
              logs
                mirror-4                                            ONLINE       0     0     0
                  ata-KINGSTON_SV300S37A60G_________________-part1  ONLINE       0     0     0
                  ata-KINGSTON_SV300S37A60G_________________-part1  ONLINE       0     0     0
              cache
                ata-KINGSTON_SV300S37A60G_________________-part2    ONLINE       0     0     0
                ata-KINGSTON_SV300S37A60G_________________-part2    ONLINE       0     0     0
      
      errors: No known data errors
    

If anyone has any questions, I'd be happy to try to answer.

~~~
apinnes
Hey mate, I've got a few questions if you don't mind:

How much does it cost in terms of power to have something of that size running
24/7 (I'm assuming it is)?

Now that you've set it up, does it require much of your time to manage monthly?

What sorts of things are you using the space for?

Lastly, how much did it cost in parts and time to get it all up and running?
Does it work out cheaper rolling your own, or would a cloud solution have been
cheaper - or does that just not meet your use case/needs?

~~~
jspiros
It is running 24/7; I don't feel comfortable powering it down regularly. That
is something I would worry about with the OP's setup - I wouldn't want to
subject all those mechanical drives to so many power cycles over time. I don't
have figures for that machine alone, but my entire rack, which includes a
router machine, two ISP modems, the ZFS machine and two SAS expanders,
averages around 330 watts.

After setting it up, I wouldn't say that it requires any time to manage.
Getting it all set up just right, with SMART alerts and capacity warnings and
backups and snapshots, all of which I roll myself with various shell scripts,
took a long time. Besides that initial investment, the only "management" I
have to do is respond to any SMART alerts, add more vdevs as the pool fills
up, and manage my files as I would on any other filesystem.
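
The capacity-warning part, for instance, can be as small as a cron job along
these lines (a sketch, not the actual scripts; the 80% threshold is arbitrary):

    #!/bin/sh
    # mail a warning when the pool passes 80% full
    CAP=$(zpool list -H -o capacity lanayru | tr -d '%')
    [ "$CAP" -gt 80 ] && echo "pool at ${CAP}%" | mail -s "ZFS capacity warning" root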

I use the space for just about everything. Lots of backups. I use the local
storage on all of my workstations as sort of "scratch" space and anything that
matters in the long term is stored on the server. The highest density stuff
is, of course, media: I have high definition video and photographs from my
digital SLRs, I have tons of DV video obtained as part of an analog video
digitization workflow, media rips, and downloaded videos. (I even have a
couple of terabytes used up by a youtube-dl script that downloads all of my
channel subscriptions for offline viewing, and that's something I doubt anyone
would do unless they had so many terabytes free.) I keep copies of large
datasets that seem important (old Usenet archives, IMDB, full Wikipedia
dumps). I keep copies of old software to go with my old hardware collection. I
have almost every file I've ever created on any computer in my entire life,
with the exception of a handful of Zip and floppy disks that rotted away
before I could preserve them, but that is only a few hundred gigabytes. I scan
every paper document that I can (my largest format scanner is 12x18 inches, so
anything larger than that is waiting for new hardware to arrive someday), so
almost all of my mail and legal documents are on there too.

(I had a dream the other night that someone got access to this machine and
deleted everything. Worst. Nightmare. Ever.)

A cloud solution would not have met my use case, since one of the primary
needs I have is to be self-sufficient in terms of storing my own data, and I
also want immediate local access to a lot of the things on there. I do use
various cloud solutions, but only for backup, never as primary storage.

Rolling it myself was definitely cheaper than any out-of-the-box hardware
solution I've seen. The computer itself is a Supermicro board with a mid-range
Xeon chip and a ton of RAM, plus an LSI SAS card. Connected to the SAS card
are two 24-bay SAS expander chassis, which contain the drives, all of which
are SATA.

I'd say that building something like this would cost you maybe about 4000USD,
not counting the cost of the drives. The drives were all between $90 and $120
when I bought them, but of course capacity eventually started going up for the
same price over time, so let's say another 3500USD for the drives.

~~~
flexd
Would you recommend building something like this for a much smaller system,
10TiB or so? I do not need that much - or do you think buying a NAS of some
kind would be better?

I kind of want to set something like this up while spending the least amount
of money. I am comfortable enough with Debian/Linux to do most things, but I
have never managed anything like this. In the end I want somewhere relatively
safe to store data, pretty much the same way you are; I just do not need
70TiB, and I have no experience with ZFS/hardware/storage.

~~~
jspiros
By "something like this", do you mean ZFS? I am a HUGE fan of ZFS, and I do
think that it's worth using in any situation where data integrity is a high
priority.

As far as ZFS on Linux, it still has its wrinkles. I use it because, like you,
I'm comfortable with Debian, and I didn't want to maintain a foreign system
just for my data storage, and I still wanted to use the machine for other
things too. (I actually started with zfs-fuse, before ZFS on Linux was an
option.)

So, I don't know. If you just want a box to store stuff on, you might want to
just look into FreeNAS, which is a FreeBSD distribution that makes it very
easy to set up a storage appliance based on ZFS. FreeBSD's ZFS implementation
is generally considered production-ready, so you avoid some ZFS on Linux
wrinkles, too.

So, I'd recommend checking out the FreeNAS website, and maybe also
[http://www.reddit.com/r/datahoarder/](http://www.reddit.com/r/datahoarder/)
for ideas/other opinions. I do a lot of things in weird idiosyncratic ways, so
I'm not sure I'd recommend anyone do it exactly how I have. :)

~~~
laumars
If you're comfortable with Debian then you shouldn't have too many issues with
FreeBSD as there is a lot of transferable knowledge between the two (FreeBSD
even supports a lot of GNU flags which most other UNIXes don't).

Plus FreeBSD has a lot of good documentation (and the forums have proven a
good resource in the past too) - so you're never going it alone (and obviously
you have the usual mailing groups and IRC channels on Freenode).

While I do run quite a few Debian (amongst other Linux) systems, I honestly
find my FreeBSD server to be the most enjoyable / least painful platform to
administrate. Obviously that's just personal preference, but I would
definitely recommend trying FreeBSD to anyone considering ZFS.

~~~
jspiros
As far as I'm concerned, the most identifiable characteristic of Debian is the
packaging system, dpkg/apt. I've used FreeBSD occasionally, and that's what I
always end up missing about Debian. I did consider going with Nexenta or
Debian GNU/kFreeBSD, but whatever, ZoL works well enough. :)

~~~
laumars
FreeBSD 10 has switched to a new package manager, so it might be worth giving
it another look next time you're bored and fancy trying something new.

I can understand your preference though. I'm not a fan of apt personally, but
_pacman_ is one of the reasons I've stuck with ArchLinux over the years -
despite its faults :)

~~~
jspiros
I'll keep that in mind; I do sometimes find myself with some time to play with
things. :)

------
13throwaway
Can somebody give me some recommendations on how to do this with encryption? I
am fine sshing into my server and putting in a password after reboot.

~~~
icebraining
[http://serverfault.com/a/588056](http://serverfault.com/a/588056)
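
Since ZFS on Linux had no native encryption in 2014, the usual approach is to
build the pool on top of dm-crypt/LUKS and unlock it over SSH after a reboot;
a minimal sketch (device names and layout made up, not taken from the linked
answer):

    # one-time setup on each member disk (sdb shown; repeat for the others)
    cryptsetup luksFormat /dev/sdb
    cryptsetup luksOpen /dev/sdb crypt_sdb
    # build the pool on the unlocked mappings
    zpool create tank raidz2 /dev/mapper/crypt_sdb /dev/mapper/crypt_sdc \
        /dev/mapper/crypt_sdd /dev/mapper/crypt_sde
    # after a reboot: ssh in, unlock each disk, then re-import the pool
    cryptsetup luksOpen /dev/sdb crypt_sdb
    zpool import tank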

------
newman314
For people interested in a ZFS build, there is zfsbuild.com

------
roncohen
"The purpose of this machine is to store backups and media, primarily _video_
"

o_O

~~~
storrgie
Took a 10 day vacation with a single DSLR shooting RAW... 60GB of
photos/videos. I've not gone through and deleted anything yet... but I don't
really have to either.

~~~
jacquesm
That's 1000 such vacations!

~~~
maaku
Better get busy.

