
Building the next generation file system for Windows: ReFS - jhack
http://blogs.msdn.com/b/b8/archive/2012/01/16/building-the-next-generation-file-system-for-windows-refs.aspx
======
ComputerGuru
_The NTFS features we have chosen to not support in ReFS are: named streams,
object IDs, short names, compression, file level encryption (EFS), user data
transactions, sparse, hard-links, extended attributes, and quotas._

Of these, I'm sorry to see the demise of sparse files. This was, IMHO, the
single most under-utilized feature of NTFS, and I was able to integrate
support for sparse files into a number of clients' applications (I'm a low-
level consultant and developer) to great effect. While the increasing size of
volumes along with the sub-par utilization of this feature makes it an obvious
victim when creating a new filesystem and looking for features to drop, sparse
files can be amazing for other reasons.

One of the advantages of sparse files is that they can be used to natively
support certain seek-related behaviors. If you create the file right, you can
save yourself a lot of code and complexity in any applications consuming that
data.

The biggest advantage of sparse files though is speed. For instance, you can
create a container file of X size filled with zero bytes, and only use as much
space as the end application requests (for example, creating a virtual disk of
2TB that only takes up 100MB on disk).
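
For readers who have not used sparse files, here is a minimal Python sketch of the container-file idea above. It assumes a POSIX system whose filesystem supports holes; the helper names `create_sparse_file` and `allocated_bytes` are made up for illustration. On NTFS the file additionally has to be flagged as sparse with the `FSCTL_SET_SPARSE` control code, which this sketch does not attempt.

```python
import os


def create_sparse_file(path, logical_size):
    """Create a file whose logical size is `logical_size` bytes.

    Extending a file with truncate() writes no data, so a filesystem
    that supports holes does not have to allocate blocks for it.
    """
    with open(path, "wb") as f:
        f.truncate(logical_size)


def allocated_bytes(path):
    """Return the bytes actually allocated on disk, or None if unknown.

    st_blocks is a POSIX field counted in 512-byte units; it is not
    available on every platform, hence the getattr fallback.
    """
    st = os.stat(path)
    blocks = getattr(st, "st_blocks", None)
    return None if blocks is None else blocks * 512
```

Comparing `os.path.getsize(path)` with `allocated_bytes(path)` shows the gap between the size the application sees and the space the volume gives up. How large that gap is depends on the filesystem.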

I, for one, am sad to see this feature go. For anyone interested in this
amazing feature, have a read here:
<http://www.flexhex.com/docs/articles/sparse-files.phtml>

~~~
CPlatypus
I agree that it's a strange set of choices. Eliminating quotas and compression
might be OK for something positioned as a basic filesystem, but for something
billed as a "next generation" filesystem it seems odd. Eliminating extended
attributes is an even bigger step backward, because they're such a useful
building block for other OS features (e.g. look at their use in Linux to
support ACLs and security labels).

Failing to support sparse files, though . . . man, that's just insane. That
would relegate ReFS to the status of a toy in most filesystem developers'
minds, even before you consider their increasing usefulness when storing
virtual-machine disk images in a shared filesystem to support migration, etc.
It's hard to imagine that none of the many people who must be involved in this
at MS raised the red flag. What seems more likely, from what I know of MS
culture, is that some people did raise it but then some idiot dictator with a
reputation built on some long-irrelevant project ignored or dismissed their
objections.

~~~
sfoskett
If ReFS is really in the Server version of 8, and if it really doesn't support
sparse files, consider the implication of built-in deduplication. It does
sparse files at a layer above the filesystem, along with dedupe and
compression. Or, as mappu points out, ReFS could assume it's running on
VHD/CSV, punting those features to a lower level.

Perhaps Microsoft is making a decision to focus the filesystem on being a
simple storage engine and moving features into other modules (primarily) above
or even below it?

~~~
CPlatypus
Dedup isn't _quite_ the same as sparse files. If the filesystem is unaware of
dedup, then it still has to allocate its own structures corresponding to the
dedup'ed space, seeking to the next hole or next allocated block won't work
(not that many applications are smart about holes or that MS didn't already
suck in that area), etc. Dedup - even in its most absolutely simple form of
just detecting zero blocks - does at least avoid the fatal problem of
allocating actual disk space uselessly, but sparse files are still basically a
filesystem problem and need to be treated as such.
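
To make "seeking to the next hole or next allocated block" concrete, here is a rough Python sketch that lists the allocated extents of a file using `os.SEEK_DATA` and `os.SEEK_HOLE`. It assumes a platform that exposes those constants (Linux does); the function name `data_extents` is hypothetical. A filesystem that knows nothing about holes will typically report the whole file as one data extent, which is exactly the information loss described above.

```python
import errno
import os


def data_extents(path):
    """Return a list of (start, end) byte ranges that hold data."""
    size = os.path.getsize(path)
    seek_data = getattr(os, "SEEK_DATA", None)
    seek_hole = getattr(os, "SEEK_HOLE", None)
    if seek_data is None or seek_hole is None:
        # Platform cannot report holes: treat the whole file as data.
        return [(0, size)] if size else []

    extents = []
    fd = os.open(path, os.O_RDONLY)
    try:
        offset = 0
        while offset < size:
            try:
                start = os.lseek(fd, offset, seek_data)
            except OSError as exc:
                if exc.errno == errno.ENXIO:
                    break  # no more data after `offset`
                raise
            # The end of the file counts as an implicit hole.
            end = os.lseek(fd, start, seek_hole)
            if end <= start:
                break  # defensive: avoid looping forever
            extents.append((start, end))
            offset = end
    finally:
        os.close(fd)
    return extents
```

A backup or copy tool can walk these ranges and skip everything in between instead of reading gigabytes of zeros.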

------
Someone
Some loose remarks:

\- named streams are out => it becomes unlikely that we will see these become
popular on any OS (because being incompatible with the market leader is
problematic; see Mac OS X, .DS_Store). I find that a pity.

\- I guess quotas are out because there will be something else replacing it?

\- Can anyone explain why a modern filesystem should have a limitation on path
length? For APIs I can understand it because the standard C library thinks
paths are fixed-length, but for file systems? I would think this complicates
the implementation, as every directory would need to know the length of the
deepest path below it (in case one attempts to rename it). Aggregating that
info upwards whenever a file is created or renamed (let alone deleted) cannot
come for free, can it?
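
The bookkeeping described in the last point can be sketched as a toy in-memory directory tree in Python. Everything here is an assumption made for illustration: the `Dir` class, the `longest_below` field, and a 260-character `MAX_PATH`. It says nothing about how NTFS or ReFS actually enforce their limits.

```python
MAX_PATH = 260  # assumed limit, for illustration only


class Dir:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        # Length of the longest path below this directory, relative
        # to it (0 while the directory is empty).
        self.longest_below = 0
        if parent is not None:
            parent._grow(len(name))

    def _grow(self, candidate):
        # Walk towards the root, updating every ancestor whose
        # longest descendant path just got longer.
        node = self
        while node is not None and candidate > node.longest_below:
            node.longest_below = candidate
            candidate = len(node.name) + 1 + candidate
            node = node.parent

    def path_len(self):
        if self.parent is None:
            return len(self.name)
        return self.parent.path_len() + 1 + len(self.name)

    def can_rename(self, new_name):
        # Would the deepest path below still fit after the rename?
        base = self.parent.path_len() + 1 if self.parent else 0
        below = 1 + self.longest_below if self.longest_below else 0
        return base + len(new_name) + below <= MAX_PATH
```

Creation costs a walk up the ancestor chain, and the sketch leaves deletion out entirely: shrinking `longest_below` would mean rescanning siblings at each level, which is the part that really would not come for free.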

~~~
wvenable
Named streams were always problematic especially in a connected heterogeneous
computing environment. I'm not sad to see them go.

~~~
Someone
But that is chicken and egg. I would rather see a world where we worked on
solving those problems, instead of giving up on them. On the good side, they
are dropping support for 8.3 filenames (but it will be interesting to see how
they solve the 'copy from this new file system to FAT' problem)

------
blibble
sounds very much like the "current generation" to me, ZFS has done just about
everything that article covers for a while, and it supports most of this too:

"The NTFS features we have chosen to not support in ReFS are: named streams,
object IDs, short names, compression, file level encryption (EFS), user data
transactions, sparse, hard-links, extended attributes, and quotas."

~~~
1010011010
Too bad they can't just use ZFS, but, NIH and all.

~~~
icebraining
I think Microsoft's legal department would freak out about having (even
"slightly") copyleft-licensed software in the Windows core.

------
Game_Ender
He closes with: "We believe this significantly advances our state of the art
for storage."

I don't think that's true at all. As others have mentioned, it appears they
are matching the state of the art achieved by ZFS.

~~~
rcthompson
He says _our_ state of the art. I guess by that he means the state of the art
of storage in Windows file systems.

------
daniel02216
I'm not sure I see the difference between a log-structured file system and
what they have proposed for their robust disk update strategy, especially when
you add integrity streams into the picture. Anyone with more filesystems
knowledge than me want to clarify this?

~~~
obtu
Their approach looks a lot like BtrFS: everything in a BTree, append-only,
with checksums.

------
jensnockert
Seems very cool, the only problem seems to be that it isn't bootable. I hope
that this might get the Linux folks a bit more serious about modern resilient
filesystems.

~~~
mbell
Most or all of what was discussed is supported by ZFS (which you can use via
FUSE in linux) and the vast majority is either already available or part of
the planned feature set for btrfs, which will probably be the primary FS of
linux moving forward.

The 'Linux folks' have been working on this for quite awhile, if it weren't
for licensing incompatibilities with ZFS, they'd likely already have it.

~~~
ComputerGuru
_The 'Linux folks' have been working on this for quite awhile, if it weren't
for licensing incompatibilities with ZFS, they'd likely already have it._

If you'll pardon my saying so, that's because it's _not_ 'the Linux folks'
that have been working on it - it's other non-'Linux folk' companies ( _cough_
RIP, Sun _cough_ ) that took it upon themselves to make a better filesystem
for their (coincidentally, and nothing to do with Linux) open source operating
systems. The closest the 'Linux folks' got was ReiserFS under Hans Reiser,
whose work was largely rejected by the mainstream 'Linux folks' working on the
kernel... until the months just before his arrest and conviction for the
murder of his wife.

ZFS has as much to do with Linux as NTFS has to do with Linux - developed
wholly outside of the Linux community by people not in the Linux community nor
associated with the Linux community, with implementations available for Linux
that are not redistributable with the kernel for patent- or licensing-related
reasons.

But, yes, BtrFS (developed by Oracle) is indeed a 'Linux folk' attempt at
creating a modern filesystem. And BtrFS does indeed predate ReFS.

~~~
mappu
Note that booting from BtrFS is pretty difficult, if at all possible, putting
it on similar footing to ReFS/Protogon.

~~~
mbell
The GRUB update in May 2011 added support for booting both ZFS and btrfs. If
you're having issues, all you need to do is update GRUB to get proper support.

------
rbanffy
Wasn't it supposed to arrive in Vista?

Now, seriously, if I got a dollar for every new Windows filesystem announced
for every next version of Windows and canned before launch, I'd be at least
five dollars richer. By the time they deliver it, IF they deliver it, BtrFS
will be widely used in Windows servers. ZFS already is way more advanced than
what they propose.

The only major change I saw was when Microsoft ditched HPFS to go with NTFS.

~~~
4ad
NT was designed to use NTFS from the start. Because NT was originally NT OS/2,
it was also supposed to be able to use HPFS, but NTFS was always the primary
filesystem.

BTRFS on Windows?

And yes, ZFS is more advanced, at least from what can be deduced from this
article.

~~~
rbanffy
> BTRFS on Windows?

Sorry. Editing accident. Can't correct it anymore. I meant Linux servers, of
course.

------
kirrmann
Now finally Windows FS gets some love.

~~~
catch23
Seems like the opposite of love -- they're removing a large swath of features
in order to bring some new ones in.

~~~
justncase80
Frankly, we don't see feature removal nearly enough these days. A move towards
simplicity seems like a win.

~~~
catch23
Simplicity is not so useful in operating systems or filesystems though. There
are enough "simple" filesystems out there -- the useful filesystems are the
ones that aren't so simple. There's lots of great educational "simple" OSes,
but nobody uses them for real work.

