
Apple kicks ZFS in the butt - joao
http://blogs.zdnet.com/storage/?p=584
======
tvon
Summary: There is still no ZFS in Snow Leopard, and we still don't know why. Let's
speculate.

~~~
mmphosis
I'll speculate: Testing.

* maybe with ZFS, there were problems like slow speed, unreliability, poor resource use, etc... more speculation of Apple's testing benchmarks...

* maybe without ZFS, no problems were found in testing. If this was the case, and remember I am speculating, I would ship without ZFS, easy.

------
pohl
I was looking forward to ZFS, but there was a recent article about BTRFS that
caught my eye.

What I would really like to see, though, is some modern filesystem with a
license that is neither offensive to a commercial vendor like Apple or
Microsoft, nor offensive to the GNU crowd or the BSD crowd. I'm not sure
that's even possible, but it would be nice to know that I could format a flash
drive in a universal way that isn't FAT32.

~~~
rbanffy
"format a flash drive in a universal way that isn't FAT32."

Microsoft will never, ever support a technology they don't control. FAT32 is
just fine by them and you can't count them out when you say "universal".

They could, of course, use BSD licensed stuff, but they would not be able to
put pressure and extract licensing fees like they do with FAT.

------
barrkel
I'm running ZFS on OpenSolaris kernel (Nexenta, for GNU userland) for home
storage, with my main zpool at 9.06TB spread over 8 hard drives, for almost
7TB worth of ZFS storage.

ZFS isn't quite what it's billed to be, but it's (mostly) better than every other
current choice. You can read the positives everywhere else, but let me point
out some of the negatives.

* Space wastage: block sizes for files grow up to 128K, but they never shrink, and the last block in a file is currently always the same as the overall file block size. This means that files e.g. 129K in size will have 127K of wasted space on disk, and files that were once 128K and truncated back down to 1 byte will still have 128K block size. Depending on the kinds of files, you can see space wastage rates in the region of 25-45%. This isn't theoretical: I had to reduce max block size to 8K on one filesystem to avoid huge wastage.

* Streaming latency: as a media server, I often play movies from data stored on this server, but there's a problem somewhere in the stack (network, OS file cache, filesystem implementation). If I'm playing a movie or TV episode from a ripped DVD (i.e. VIDEO_TS directory) I can guarantee that there'll be at least one glitch in every 40 minutes of playback, where network transfer has dropped to zero for 3 or 4 seconds.

* Device removal: you can add more storage to a ZFS pool, and one can replace any given device with a drive equal or larger in size, but it's not possible to remove a device. If you start out with 8 disks, you're stuck with 8 disks unless you want to backup and restore the entire FS - but at least FS streaming is easy with e.g. zfs send <snapshot> | ssh <remote> zfs receive <filesystem>.

* Fragmentation: there is no defragmentation solution currently. Sure, people say it's "not an issue", that "ZFS is not pathological", but the same was said about NTFS, and NTFS fragments like crazy when small incremental writes are used. ZFS performance is known to fall off a cliff (e.g. less than 1% of prior perf) as you approach capacity (symptomatic of fragmented free space) and, more importantly, _stay_ abysmal until you back off quite a bit and preferably drop a whole file system or two. The recommendation is to stay under 80%, which isn't very full.

* Performance: performance is a bit of a red herring; you can make ZFS as fast as you want by adding mirroring and striping. The more interesting case is when you don't have mirrors and stripes out the wazoo, and are relying on RAIDZ or RAIDZ2, and here the performance depends on access patterns. The way the parity and checksumming works, you need to burn a lot of cache to get decent small random read/write performance, because entire blocks need to get read and written even for small updates. So performance looks great for a while, then falls off.
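
The arithmetic in the space-wastage bullet above can be sketched in a few lines. This is only a model of the behaviour as described in that bullet (whole blocks of a fixed size, last block never trimmed), not of ZFS internals; the function name `wasted_kib` and its parameters are made up for illustration.

```python
import math


def wasted_kib(file_kib, block_kib=128):
    """KiB of slack when a file is stored in whole blocks of block_kib.

    Assumes the file's block size has already grown to block_kib and that
    the final block is allocated at full size, as described above.
    """
    blocks = math.ceil(file_kib / block_kib)
    return blocks * block_kib - file_kib


print(wasted_kib(129))     # 127: a 129K file needs two 128K blocks
print(wasted_kib(129, 8))  # 7: the same file with an 8K max block size
```

Under this model, dropping the maximum block size from 128K to 8K cuts the worst-case slack per file from just under 128K to just under 8K, which is why the smaller block size helps on filesystems full of modestly sized files.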

I still think ZFS is one of the best choices for this kind of application -
the software RAID, multiple-volume handling is terrific - but it's far from
the last word, and not 100% baked (needs defragmentation and data recovery
tool ecosystem). BTRFS looks like it could be a very worthy contender, but
from what I've read, I haven't seen that it values a storage-
unit/pool/filesystem stack, where storage-unit can be a disk, a file or a
partition, and still seems stuck on the old device/filesystem approach.

~~~
ciupicri
What case are you using? Also, what are you using for cooling?

~~~
barrkel
Just an average case, Gigabyte 3D Aurora full tower. Here's a review:

[http://www.cluboverclocker.com/reviews/cases/gigabyte/aurora...](http://www.cluboverclocker.com/reviews/cases/gigabyte/aurora/index.htm)

HDs are both in the internal HD bays, which have a dedicated fan, and in a
3.5/5.25 adapter with active cooling in the 5.25 bays up top. It's something
like this:

<http://www.pcstats.com/articleview.cfm?articleID=2313>

------
philwelch
Apple has a history of working on features and then dropping them. They had
one in the works for years where your home directory would be sync'd to your
iPod so you could plug into any Mac anywhere and get it all back.

It's more than likely that ZFS meant biting off more than they could chew, so
they deferred it for another release.

~~~
kylec
You can still get at the home sync feature if you manually enable it - it can
be found at /System/Library/CoreServices/Menu Extras/HomeSync.menu

~~~
duskwuff
The menu extra stub is still present, but there's no way to configure it --
the Accounts preference pane doesn't have the options that the menu extra
refers to.

~~~
kylec
You can get at them if you enable the menu item, then select "Mobile Account
Preferences" from the dropdown. However, this is as far as I've gotten because
I don't have the ability to test to see if this works.

~~~
duskwuff
Oh, wow, you're right. (Tip: You have to unlock the Accounts pref pane first
to make it show up.) I haven't tried this out, but I almost want to now.

~~~
dhess
I don't think that's what you think it is. The Home Sync menu I'm familiar
with is for mirroring a network home directory on your local disk, in
conjunction with another machine running Mac OS X Server. As far as I know,
this feature has nothing to do with portable home directories on iPods.

------
jeromewbrock
2 things:

1\. Sun's shared source license sucks and this is why ZFS isn't natively
supported in Linux.

2\. btrfs will soon have better features/performance/stability than ZFS, and
a better architecture to boot: <http://lwn.net/Articles/342892/>

~~~
jodrellblank
"Soon", maybe, but it will take a long time for it to be Generally Recognised
As Reliable. (For all I argue against buying into a brand just because, I do
like ZFS more with Sun's name behind it)

------
jsz0
I don't really understand the hype around ZFS. What's so great about it?

I've tried it out on OpenSolaris a few times in an attempt to learn why it's
so hyped up. The setup process is a bit cryptic and confusing. Even after the
initial setup I was really confused about what commands were destructive and
which ones were not. Based on that alone I would not consider using ZFS yet.
The flexibility of ZFS was a bit lost on me. It seems like there's a ton of
limitations and caveats to consider. I can't really comment on performance or
reliability since this was a short lab test but in the last 5 years or so I've
never lost a HFS+, EXT3 or NTFS file system so I'm not sure just how much more
reliable ZFS can be. Is all this hype just acronym lust? I feel like a good
RAID card combined with a semi-modern FS is still a better solution. It's
certainly easier to setup IMO.

~~~
enneff
ZFS allows RAID-5 without the "write hole" (google it). That's the killer
advantage for me. The rest is just gravy.

I'm amazed you found set-up cryptic. Once I understood the concepts of ZFS
pools and filesystems, I found the toolset among the most elegant and well-
designed out there.

I think it might actually be a little daunting in its simplicity, maybe.
("that's it?!")

~~~
jsz0
I think that might be it. I found it a bit hard to visualize what was actually
happening with these commands since they are all pretty simple.

------
mustpax
Maybe Oracle is less willing than Sun to license their core differentiating
server feature to competitors.

I believe Time Machine is still built on top of ZFS snapshots though. So
obviously some ZFS code remains.

Edit: Apparently Time Machine is not based on ZFS. So it seems ZFS is
completely off the map after all.

~~~
Locke1689
Actually, Time Machine is not and has never been based on ZFS. It is
implemented using HFS+, journaling, and a kernel/fs modification for
journaling and hard links. A Time Machine snapshot is basically modified hard
link journals.

~~~
rbanffy
The sad part is that Time Machine would be a one-liner had OSX supported ZFS
since Leopard...

~~~
glhaynes
Which one line (heck, doesn't have to just be one) command would you issue to
get Time Machine-like functionality from ZFS?

ZFS contains a lot of features that _sound_ like they overlap with Time
Machine, but in practice they really don't in a way that would make Time
Machine a "free" feature.

~~~
furyg3
I implemented a simple personal backup solution a long time ago which is
(essentially) what Time Machine implements now (without the eye-candy
application, of course). It's not that hard, it only requires hard-linking,
which the OS X kernel didn't support, so they added it.

Steps:

First run:

* Rsync data from source drive to backup drive (can be done over network, ssh, whatever). We'll call this backup "0"

Second run:

* On the backup drive, copy the last backup folder (backup "0") to a new backup folder (backup "1"), using hardlinks. Hardlink copies don't take any additional space, since they're pointing to the original resource on the disk (folders do add some negligible space).

* Rsync data from source drive to backup drive (backup "1"). Rsync, of course, only updates what's new/changed.

Repeat, saving as far back as you'd like, and deleting old backups if the
drive is full, after N backups, or so many days. Now you've got an easy
incremental backup bash script.
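
For illustration, here is a minimal Python sketch of the steps above. It is not the original bash script and not rsync: it stands in for "hardlink-copy the last backup, then sync" using only the standard library, and the name `snapshot` and the numbered-folder layout are assumptions. Unlike rsync it compares file contents rather than modification dates, and it does not remove files that were deleted from the source.

```python
import filecmp
import os
import shutil


def snapshot(source, backup_root, number):
    """Create backup_root/<number>, sharing unchanged files with backup <number - 1>."""
    new = os.path.join(backup_root, str(number))
    prev = os.path.join(backup_root, str(number - 1))
    if os.path.isdir(prev):
        # "Copy" the previous backup with hard links: no file data is duplicated.
        shutil.copytree(prev, new, copy_function=os.link)
    else:
        os.makedirs(new)

    # Sync the source into the new backup, rewriting only new or changed files.
    for dirpath, _dirnames, filenames in os.walk(source):
        rel = os.path.relpath(dirpath, source)
        target_dir = os.path.normpath(os.path.join(new, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(target_dir, name)
            if os.path.exists(dst):
                if filecmp.cmp(src, dst, shallow=False):
                    continue  # unchanged: keep the hard link to the old copy
                # Break the link first so the older backup keeps its old contents.
                os.unlink(dst)
            shutil.copy2(src, dst)
    return new
```

Calling `snapshot(src, backups, 0)` and later `snapshot(src, backups, 1)` gives two complete-looking folders in which unchanged files share the same inode. The `os.unlink` before the copy matters: writing through an existing hard link would silently modify the older backup as well.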

Hardlinks are beautiful, because they don't add extra space, and once all the
hardlinks to a resource are deleted the resource becomes free space. This
isn't an enterprise-level backup solution, but it's a great quick-and-dirty
way to do incremental backups, and is exactly how Time Machine works.

~~~
glhaynes
One way this isn't exactly like Time Machine: since it doesn't hook into the
FSEvents journal of file system modifications, rsync has to check the modified
dates of every file in the backup set. With FSEvents, Time Machine knows
what's changed (or at least which folders contain changed items) and can deal
with only those.

But, yeah, same basic concept for sure.
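
A rough sketch of that difference, in Python. The `changed_dirs` list stands in for whatever a change journal would report; it is a hypothetical input, not the real FSEvents API, and both function names are invented for the example.

```python
import os


def scan_everything(root):
    """rsync-style: walk the whole tree and return every file that must be checked."""
    return [
        os.path.join(dirpath, name)
        for dirpath, _dirnames, filenames in os.walk(root)
        for name in filenames
    ]


def scan_changed(changed_dirs):
    """Journal-style: check only files directly inside folders reported as changed."""
    return [
        entry.path
        for directory in changed_dirs
        for entry in os.scandir(directory)
        if entry.is_file()
    ]
```

With a large home directory and a handful of changed folders, the second list is a small fraction of the first, which is where the time saving comes from.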

------
dmaz
If the point of 10.6 was to add improvements to the OS without major end-user
changes, then ZFS could have been tested in the development cycle and then put
on hold for the next major release.

------
miracle
Well, they can integrate it in their next service pack then and charge another
50$! :-)

~~~
Herring
It's going to cost $50 regardless, so why work on ZFS? I wonder if they've
thought of just charging money for no features.

