
Paragon submits 27k-line NTFS driver to Linux kernel - kiyanwang
https://www.theregister.com/2020/08/18/paragon_tries_to_contribute_ntfs/
======
piscisaureus
I understand that reviewing 27k LoC is daunting and probably not very fun.

But unlike most patches that draw a similar response it's not a narrowly
useful patch that mostly serves the submitter. Proper NTFS support benefits a
large proportion of Linux users (the jab from the article that there are more
advanced file systems out there seems out of place; there are no signs that
Windows is about to switch its default FS to something else).

Additionally this code has been used in production for years now (e.g. my 2015
router runs the closed source version of this driver in order to support NTFS
formatted external drives) so most likely a lot of quality issues have already
been found and addressed.

So I feel it's a bit unreasonable to respond with so much negativity to this
contribution.

~~~
AnotherGoodName
It's also no big deal from either side. Paragon sent in the patch and it's
appreciated. There are a few problems to get this in. Reviewers noted the issues
and what would need to be done to get this through. The process to get this in
is happening.

"Split your diff!" and "Fix your makefile!" have to be among the most benign and
common pieces of diff feedback I've seen. I feel that you could make a media
story about any submission to the Linux kernel based on there being comments
in the review process.

~~~
piscisaureus
Admittedly I didn't actually read the mailing list discussion. It's entirely
possible that The Register made up a big drama where there was none.

~~~
mehrdadn
Not sure if this counts as drama per Linux kernel mailing list standards:
[https://lore.kernel.org/linux-fsdevel/2911ac5cd20b46e397be50...](https://lore.kernel.org/linux-fsdevel/2911ac5cd20b46e397be506268718d74@paragon-software.com/t/#m34fc84d93a7009caf533ff400f7c763da4cab366)

> So how exactly do you expect someone to review this monstrosity ?

~~~
isatty
It’s a legitimate concern - I would not assume malice. How exactly would
someone review a 25k loc .patch file?

~~~
michaelt
It would be tough, no doubt.

But it's not like splitting the feature into 100 patches of 250 lines each
would make it any quicker to review. Or merging code that was known not to
work, as it was only a fraction of what was needed for the functionality.

~~~
wtallis
> But it's not like splitting the feature into 100 patches of 250 lines each
> would make it any quicker to review. Or merging code that was known not to
> work, as it was only a fraction of what was needed for the functionality.

That would _also_ be rejected, because the kernel maintainers aren't idiots
and their standards aren't the stupid arbitrary rules you construe them to be.
They generally want big changes to be broken up into logical, sensible chunks
that each leave the tree in a usable state, so that git-bisect still works.

~~~
skissane
How do people merge big new filesystems in practice though? Especially one
with years of pre-existing out-of-tree development?

I guess one could start by merging a skeleton of the filesystem which supports
mount/unmount but then returns an IO error on every operation? And then a
patch to add directory traversal (you can view the files but not their
contents), and then a patch to add file reading, and then a patch to add file
writing, and then a patch to add mkdir/rmdir, and then a patch to add
rename/delete of regular files.

Breaking down an existing filesystem into a sequence of patches like that, no
doubt it is doable, but it is going to be a lot of work.

~~~
wtallis
My guess is that given the history of this filesystem implementation, most of
the review effort will be focused on the interface between this FS and the
rest of the kernel. It's typical for all the changes touching communal files
or introducing generic helper functions or data structures to be broken out
into separate commits. If any of those helpers are a reinvention of stuff
that's already in the kernel, there will need to be a justification for why
NTFS needs its own special versions. It's _not_ typical for a large patch
series adding genuinely new stuff to be broken up into absurdly tiny commits.
For the stuff that's truly _internal_ to the filesystem implementation, it
looks like one patch per file will be an acceptable granularity.

------
cactus2093
It's strange to me that there's no file system with basic features like
journaling and support for files larger than a couple of GB, that is supported
across all major desktop OSes (macOS, Windows, Linux, and FreeBSD). All
platforms support the same I/O standards like USB or DisplayPort, so why did
filesystems never make the cut to become a cross-system standard?

Imagine if you could have a backup drive (with reasonable modern data
protections) that you could just plug into different systems and save all your
files to. Isn't it odd that such a simple thing isn't possible? I guess
network attached storage has gotten pretty accessible at this point so there's
no need for it?

~~~
quarantine
There's a ZFS driver for Linux, macOS, and Windows.

~~~
hoistbypetard
ZFS development moves so fast that it is common for my (FreeBSD-based) FreeNAS
box to warn me when I upgrade my OS that certain actions will make it
incompatible with the prior version of FreeNAS.

That is fine and appropriate for a drive that will be connected to the system
for the foreseeable future.

That kind of compatibility concern makes me squeamish about using ZFS for a
drive that I want to share between different systems. If it's easy to make it
incompatible between two releases for the same system, that smells like a
waiting nightmare trying to keep it compatible between Linux, FreeBSD and
Windows.

~~~
DenseComet
Yeah it should stabilize in the next couple of months with the release of
OpenZFS 2.0 as that release is supposed to signify the unification of ZFS on
Linux and FreeBSD. ZFS on FreeBSD is being rebased onto ZFS on Linux. There's
also been some talk of adding macOS ZFS support to OpenZFS, but that's still up
in the air.

~~~
mycall
This is great news

~~~
codetrotter
Agreed! I’ve been running FreeBSD on various computers for very close to a
decade now, and still run it on my mail server, but one problem that I faced a
couple of years ago when I sold my old laptop, which I was running FreeBSD on,
was that my other computer at home at the time was running Linux but I had an
external HDD that I’d been using with the laptop and which I was using GELI
encryption on.

Since I didn’t have money for any more hard drives at the time, I couldn’t
transfer the data to anything else. So then when I wanted to access that data
I’d do so via a FreeBSD VM running in VirtualBox. The performance was... not
great.

I took the data that I needed the most, and for the rest of the data I let it
sit at rest.

This week I wanted to use the drive again, and in the end because I was doing
general cleanup, I decided to install FreeBSD on my desktop temporarily.

I actually love FreeBSD but the reason that I prefer to have my desktop
running Linux is in big part because I want software on the computer to be
able to take advantage of CUDA with the GTX 1060 6GB graphics card that I have
in it, and unfortunately only the Linux driver by Nvidia has CUDA, the FreeBSD
driver by Nvidia does not.

I was actually looking at installing VMWare vSphere on the computer instead,
so that I could easily jump between running Linux and running FreeBSD with
what I understand will probably be good performance compared to VirtualBox at
least. But the NIC in my machine is not supported and vSphere would not
install. I found some old drivers, messed around with VMWare tooling which
required PowerShell, and which turned out not to work with the open source
version of PowerShell on any other operating system than Windows. So then I
downloaded a VM image of Win 10 from Microsoft [0], and used that to try and
make a vSphere installer with drivers for my NIC. No luck at first attempt
unfortunately. A decade ago I probably would have kept trying to make that
work, but at this point in my life I said ok fine fuck it. I ordered an Intel
I350 NIC online second-hand for about $40 shipping included, and the guy I
bought it from sent it the next day. It is expected to arrive tomorrow.
Meanwhile, I installed FreeBSD on the desktop. When the NIC arrives I will do
some benchmarking of vSphere to decide whether to use vSphere on the desktop
or to stick to either FreeBSD for a while on that machine or to put it back to
just Linux again.

Anyways, that’s a whole lot more about my life and the stuff that I spend my
spare time on than anyone would probably care to know :p but the point that I
was getting to is that, with OpenZFS 2.0 I will be able to use ZFS native
encryption instead of GELI and I will be able to read and write to said HDD
from both FreeBSD and Linux.

I still need to scrape together money for another drive first before I can
switch from GELI + ZFS to ZFS with native encryption though XD

Oh, and one more thing, with the external drive I was having a lot of
instability with the USB 3.0 connection on FreeBSD, leading to a bit of pain
with transferring data because the drive would disconnect now and then and I’d
have to start over. But yesterday I decided to shuck the drive – that is, to
remove the enclosure and to connect the drive with SATA like you would any
other regular internal drive. It worked out excellently, the WD Essentials
enclosure was easier to pry open than I had feared, and a video on YouTube
showed me how to do it [1]. As prying tools I used a couple of plastic rulers.
As a bonus, it also looks like I/O performance is better with the direct SATA
connection than what I was getting with the USB 3.0 connection.

Speaking of that, some people have reported finding that the drives in their
WD Essentials external drives were WD Red HDDs. I didn’t have the same luck
with mine; mine was WD Blue. But idk if WD Red is even common with the
capacity that mine has anyways. Mine is “only” 5TB and I think the people that
have been talking about finding WD Red drives in theirs have often bought 8TB
models. Idk. The main thing for me anyways is just to have my data and
someplace to store it ^^

[0]: [https://developer.microsoft.com/en-us/microsoft-edge/tools/v...](https://developer.microsoft.com/en-us/microsoft-edge/tools/vms/)

[1]: [https://youtu.be/QApvLyorr3g](https://youtu.be/QApvLyorr3g)

------
sam42
In the meantime v2 of the patch has already been submitted, addressing some of
the points mentioned in the article:
[https://lore.kernel.org/linux-fsdevel/904d985365a34f0787a451...](https://lore.kernel.org/linux-fsdevel/904d985365a34f0787a4511435417ab3@paragon-software.com/)

------
nippoo
This is such an odd article. Perhaps Paragon isn't being entirely altruistic
with this move but TR are being quite scathing of someone's work submitting a
kernel driver with no direct financial reward - and no real praise for,
hopefully, fixing one of the biggest out-of-the-box gripes that Windows /
Linux desktop dual-boot environments have. It's no small wonder people often
complain about the hardships of writing and maintaining open-source software!

~~~
shakna
I don't get where TheRegister is getting the drama. The thread doesn't seem
that scathing to me. [0]

It doesn't build, but the person who pointed it out also supplied a diff to
make it happen.

It also fails a few tests, but Paragon are more than happy to see if they can
make it a bit more compliant.

UBSan finds a few potential bugs, but again, Paragon are more than happy to
fix the problems.

There are some style guide suggestions, which Paragon seem to immediately take
on board:

> The patch will be splitted in v2 file-wise. Wasn't clear initially which way
> will be more convenient to review.

[0]
[https://lore.kernel.org/lkml/2911ac5cd20b46e397be506268718d7...](https://lore.kernel.org/lkml/2911ac5cd20b46e397be506268718d74@paragon-software.com/)

~~~
Delk
Yeah, The Register can often be witty and irreverent, and I wouldn't complain
about that, but that kind of wit only works if there's an actual point behind
it. Jumping on the apparently popular bandwagon of squeezing out "drama" from
the Linux kernel mailing list isn't adding anything of value.

It also seems a bit ignorant to call the submission "half-baked" just because
it needed to be worked on and because someone pointed that out (also) in a
slightly irreverent way.

All of those points you bring up about the feedback they got seem like just
business as usual for a larger merge of code to a carefully developed FOSS
project.

------
DHowett
LKML post:
[https://lore.kernel.org/lkml/2911ac5cd20b46e397be506268718d7...](https://lore.kernel.org/lkml/2911ac5cd20b46e397be506268718d74@paragon-software.com/)

Earlier discussion:
[https://news.ycombinator.com/item?id=24170001](https://news.ycombinator.com/item?id=24170001)

~~~
cozzyd
Thanks, maybe it was the expectations set by the tone of The Register article,
but the mailing list discussion seemed mostly very reasonable...

------
mauvehaus
Possibly a dumb question, but what are the plusses/minuses of something like
this as compared to NTFS-3G?

[https://en.wikipedia.org/wiki/NTFS-3G](https://en.wikipedia.org/wiki/NTFS-3G)

~~~
dTal
You'll be able to install Linux on NTFS root now!

~~~
mehrdadn
I wish Linux on NTFS could share metadata with WSL1. That way you could boot
it off the same WSL1 files you could access in Windows.

------
mariuolo
I don't think it's completely fair.

Don't look a gift horse in the mouth, especially when you need one.

~~~
mason55
This is like giving someone a litter of puppies as a gift and then walking
away.

Hooray, free puppies. Now you just have to care for them for the next 10
years.

~~~
setr
Paragon stated their intent to maintain and support the code in their initial
email, and added themselves to the maintainer files.

There's no drama happening here. Paragon guys are trying to give it
"properly", and linux guys want it "properly", and the only thing happening is
defining "proper" in this context

~~~
aronpye
Intent, promises and reality aren’t always aligned. Especially when given a
slap dash of tens of thousands of lines of code that didn’t meet kernel
contribution guidelines.

~~~
setr
That's true... but nothing has happened yet. No one has yet failed to hold up
their end. All parties seem to want this to work -- it certainly isn't a dead
drop.

------
userbinator
I wonder if the reason for the huge line count is overly verbose code, or if
it's just the inherent complexity of NTFS. For contrast, I wrote a FAT32
filesystem driver (read/write) for an embedded system a long time ago, and it
was less than 1K lines --- of Asm.

~~~
The_Colonel
NTFS is a way more complicated (and feature-rich) filesystem than the (rather
barebones) FAT32.

------
prirun
It's just a single data point, but I tried Paragon's ext4 driver (the paid
version) for Mac many years ago. It seemed to work, but when the drive was
connected back to Linux, there were all kinds of fsck errors. Immediately
deleted it.

~~~
apfsx
This has also been my exact experience. I used Paragon's NTFS driver (paid)
for Mac for my external SSD. After using, when plugging into Windows it would
always find "errors" and recommend to Scan and Fix them.

~~~
Wowfunhappy
Fwiw, I had the same problem with Paragon but have had a lot more success with
Tuxera’s driver, if you’re still looking for a solution.

------
bArray
27k lines isn't _that_ crazy. I've merged larger patches, it just takes a
while. This is a very negative view of an open source contribution.

~~~
wtallis
> I've merged larger patches,

In what kind of context? The Linux kernel?

Filesystem code is pretty tricky to begin with, and prone to very subtle bugs
with very not-subtle consequences. And this isn't greenfield development of a
new filesystem, but an implementation that needs to remain highly compatible
with Microsoft's version. This FS driver has to be maintained to track changes
to _two_ operating systems. So _this_ 27kloc can reasonably be expected to
encompass a lot more complexity than your average 27kloc, and it requires a
lot more review effort than something like 27kloc of GPU driver register
definitions.

~~~
bArray
> In what kind of context? The Linux kernel?

Not the Linux kernel, but a large embedded system.

> Filesystem code is pretty tricky to begin with, and prone
> to very subtle bugs with very not-subtle consequences.

This code has been running in the wild for quite a while now, it has had a
trial by fire. And there's no way around testing, subtle ext4 bugs still crop
up despite the maturity of the filesystem.

> And this isn't greenfield development of a new filesystem,
> but an implementation that needs to remain highly
> compatible with Microsoft's version.

No amount of code review will stop Microsoft from adapting their version.
Also, I doubt Microsoft themselves will change too much about the filesystem
given the compatibility they themselves have to maintain with cold storage
NTFS drives.

> This FS driver has to be maintained to track changes to
> two operating systems.

You make it sound as if Microsoft have a hand in any of this. Also, have you
seen the state of the current NTFS driver? It's a bit flaky (no disrespect to
the maintainers).

~~~
wtallis
> No amount of code review will stop Microsoft from adapting their version.

Way to miss the point. Code review for the kernel isn't just about verifying
that the code currently works. It's also about making sure the code is
_maintainable_. Microsoft is relevant here because their actions will increase
the maintenance burden of any Linux NTFS driver. Kernel developers rightly
need to be concerned about how difficult it will be to extend the NTFS driver
to handle new NTFS features that Microsoft introduces.

~~~
bArray
> Way to miss the point.

The point wasn't so clear, but I see what you're saying now. Maintainability
is normal code review though.

> It's also about making sure the code is maintainable. [..]
> Kernel developers rightly need to be concerned about how
> difficult it will be to extend the NTFS driver to handle new
> NTFS features that Microsoft introduces.

Maintainability is one thing, extensibility is another. Preparing your code to
implement some changes completely outside of your control seems like a waste
of time and something that might bite you later on.

------
beervirus
27k is 27 kB.

This is 27 kLOC.

------
Tade0
I solemnly swear never to complain again about the size of PRs given to me for
review.

It's going to take months to thoroughly assess the quality of this.

------
Havoc
Ouch. That looks really useful, but at the same time possibly their first
kernel submission?

Hope I’m wrong but I think they’ll be fighting an initial bad impression for a
while despite good intentions

------
adamretter
I hope it's better than their ext2 drivers for macOS. I have had nothing
but problems and data loss and their support is the worst.

~~~
jug
Same on Windows regarding data loss. Terrible and I’ll never use their ext2
drivers again.

------
aaron695
> "It looks as though that with NTFS being surpassed by other more advanced
> file-systems,

Is this remotely true?

Will my grandmother not be using NTFS in 10 years?

~~~
wtallis
Microsoft has been trying to come up with a replacement for NTFS for a _very_
long time. They've had mixed success with trying to extend NTFS with more
advanced features like the now-deprecated TxF. There's little doubt that even
its creators see NTFS as something of a dead-end. Whether your grandmother is
still using it in 10 years depends primarily on whether Microsoft can get its
act together to pick and ship a replacement.

~~~
aaron695
> Whether your grandmother is still using it in 10 years depends primarily on
> whether Microsoft can get its act together to pick and ship a replacement.

So yes she will I guess.

Then it's vital Linux gets NTFS working well.

My partner is not going to use Linux things if every time they try and
transfer the 8k 3D holographic photos of our CRISPR'ed dog learning to spell
to my grandmother it doesn't work on her Holovision.

True story, last month I just lost about 1 in 10 of my media files on my Linux
Share to NTFS issues. So now I run the box on Windows.

~~~
wtallis
> True story, last month I just lost about 1 in 10 of my media files on my
> Linux Share to NTFS issues. So now I run the box on Windows.

Were you seriously running a Linux-based NAS with NTFS as the underlying
filesystem, or did you mean something very different? I can't imagine why
anyone would ever think their choice of disk filesystem on a server—hiding
behind a network filesystem—should be influenced by what disk filesystems are
supported by client devices.

~~~
aaron695
Given my original question about my grandmother has been downvoted it says it
all about the judginess of Linux users ;)

My setup was -

Windows and Ubuntu dual boot gaming PC with a NTFS 4G external hard disk,
Windows network share. Default boot was to Ubuntu.

Torrents running off wireless laptop to the network share.

Amazon Fire stick with Kodi wireless running to network share.

------
r41nbowdash
200loc/h, it's 4-5 weeks of work for a single person
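
A rough check of that arithmetic, as a sketch only: the 27k line count and the 200 loc/h rate come from the thread, while the hours-per-week figures and the `review_weeks` helper are assumptions made up for illustration.

```python
# Back-of-the-envelope review-time estimate.
TOTAL_LOC = 27_000   # patch size, from the headline
REVIEW_RATE = 200    # lines reviewed per hour, as stated in the comment


def review_weeks(total_loc: int, loc_per_hour: int, hours_per_week: float) -> float:
    """Return the number of weeks needed to review total_loc lines."""
    hours = total_loc / loc_per_hour
    return hours / hours_per_week


if __name__ == "__main__":
    print(f"{TOTAL_LOC / REVIEW_RATE:.0f} hours of review in total")
    # Assumed workloads: a full 40 h week, and 30 h of actual review per week.
    for hours_per_week in (40, 30):
        weeks = review_weeks(TOTAL_LOC, REVIEW_RATE, hours_per_week)
        print(f"{hours_per_week} h/week: about {weeks:.1f} weeks")
```

Under these assumptions the total is 135 hours, so the 4-5 week figure corresponds to roughly 27 to 34 hours of actual review per week.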

------
PhoenixRobo
Does it need to be included in the Linux Kernel? I understand that NTFS is 27
years old and mostly to support Microsoft file systems. Can't the driver be an
external download?

~~~
EE84M3i
In-tree drivers are the standard for Linux kernel modules.

------
londons_explore
As long as it builds as a module, uses only the standard filesystem APIs and
doesn't have code changes outside that, code review is far less important
IMO...

~~~
segmondy
I wish you weren't getting downvoted, but what you said is absolutely correct.
If it's a module and separate from core, who cares. If it's modifying core
kernel or mixed up with kernel space code, then nope.

~~~
wtallis
Linux isn't a microkernel; a filesystem accepted into the kernel source tree
will run as kernel space code whether or not it's compiled as a loadable
module. So all 27k lines need to be audited for security purposes and to
ensure they're interfacing with the rest of the kernel only in the approved
ways, because there aren't a lot of technological barriers to the filesystem
misbehaving.

But more important than that is the maintenance burden. NTFS will be around
for a long time, but it's also a moving target because Microsoft hasn't
replaced it yet. Kernel developers have to keep in mind how this code will
look in a few decades, after all the original developers are retired. If it's
written in a very different style from other Linux filesystems, will there be
anyone left who both knows enough about the workings of the Linux IO stack 20
years hence, and understands Paragon's code conventions?

