
Kernel – Fix cluster_write() inefficiency - protomyth
http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/cf297f2cccd7fd102041ebf174af66a31a5ef5ce
======
re
Looks like HAMMER is DragonFlyBSD's filesystem, for those (like me) who may be
unfamiliar with the distro:
[http://www.dragonflybsd.org/hammer/](http://www.dragonflybsd.org/hammer/)

Previous discussion about the filesystem vs. ZFS:
[https://news.ycombinator.com/item?id=8829194](https://news.ycombinator.com/item?id=8829194)

~~~
walkingolof
Not really fair to call it a distro; that gives the wrong impression. It was
forked off FreeBSD in 2002 or 2003.

~~~
roninb
Distro is synonymous with OS at this point. No one is using it as a derogatory
term for OSes that aren't different enough to give them the denotation of
"OS".

That said, you're going to have to come up with an argument stronger than
"this was forked ages ago" to suggest Dragonfly and FreeBSD are different
enough to warrant thought. Ubuntu forked before 2004 and you don't see anyone
being called out for comparing it to Debian.

~~~
laumars
> _Distro is synonymous with OS at this point._

Possibly to the layman but the term "distro" (short for "distribution")
deliberately exists to differentiate between different OSes and different
_distributions_ of the same OS but with a different software stack and default
configurations.

While you do sometimes get differences between the Linuxes in terms of init
daemons and slightly patched kernels etc, they are all generally GNU/Linux -
ie generally share the same common Linux OS fundamentals (eg GNU coreutils).
There will be exceptions to this rule (isn't there always?!) but we're talking
about the common desktop / server platforms people automatically talk about
when discussing "Linux".

However, many of the BSDs are developed in isolation. While they may have a
shared heritage, the kernels have matured into something quite different from
one another. Often the core utilities and init daemons et al can vary
noticeably as well.

So in essence, FreeBSD and Dragonfly are different but similar OSes, whereas
Linux distributions are the same OS but differently configured. This is why we
make the distinction between "distro" and "OS".

~~~
_delirium
I've been experimenting with Debian/kFreeBSD lately, and one funny aspect it's
had is making me realize that the kernel is not really all that important to
the userland experience, at least for what I do. It still feels like Debian,
even though they swapped out the Linux kernel for the FreeBSD kernel. That's
one reason I think of Debian as an OS, rather than a Linux distro. Debian
itself then has several distros, such as Debian/Linux and Debian/kFreeBSD.

~~~
laumars
I get your point, but aside from the package manager and associated tools, most of
what you've described are GNU packages / cross platform user land. ie stuff
that even in the most limited sense was never Debian specific to begin with;
but often was designed to run on most Unix-like platforms anyway. Because of
this I've read some people describe kFreeBSD as GNU/kFreeBSD (ie GNU user land
with the FreeBSD kernel) and I think that makes some sense if you're following
the GNU/Linux naming convention (which I think makes sense in the context of
this discussion because it also takes the Hurd kernel into account -
GNU/Hurd). But this is one of those edge cases I was thinking of when I talked
about exceptions to the rule in my grandparent post.

As for whether it's classified as a separate OS from Debian (Linux) or FreeBSD
- there's definitely some room for interpretation so I'll hold off from
passing my own personal judgement :)

Talking about rule exceptions, another good example would be Android. Largely
the same kernel as GNU/Linux and some of the user land too but equally it's a
very different platform to "desktop / server Linux"

~~~
acdha
One big difference is that Debian has invested a huge amount of work over the
years cleaning up the software they package to make it easier to customize,
maintain configuration across updates, etc. That's much broader than the
percentage of GNU utilities in Debian's userland.

~~~
laumars
Debian aren't the only ones that do that. However, patching 3rd party software
doesn't make it "Debian's user land".

Debian also runs a lot of software written by Red Hat (eg systemd) but that
doesn't make it a distribution of Red Hat. Ubuntu ships a lot of in house
software as well (eg Unity) but that doesn't mean ArchLinux with the Unity DE
turns Arch into a distribution of Ubuntu.

What you're doing is akin to classifying groups of web browsers by the
websites the user visits rather than by the rendering engines they're built
on.

~~~
_delirium
I find it more integrated in Debian than in most other distributions (even
including the BSDs, once you get outside their base systems and into the wild
west of ports). It's not just that they ship other software in the package
manager, but that it's heavily customized to fit into the "Debian way" of
doing things. This is partly emphasized by the lack of a distinction between
"base" and "ports"; every package, from libc to your mailserver, is a Debian
package, and is supposed to follow policy and work how users expect a Debian
package to work.

It does vary package to package, but on average there's a fairly substantial
amount of work that goes into "Debianizing" a package so it's integrated into
the OS coherently, vs. the more lightweight build-scripts that you find with
systems like Slackware or FreeBSD ports, that typically ship something closer
to upstream. I would compare it more to maybe halfway towards how FreeBSD
adapts third-party software into its base system. FreeBSD base is full of
third-party patched stuff, too, like ZFS, LLVM, OpenSSH, and sendmail, which
is periodically synced with upstream. But they put enough effort into
customizing it and making it work together coherently, that I think it's fair
to call the base FreeBSD install an "operating system".

I do think Debian as an OS may get less true with systemd, though. systemd is
really going all-in on the idea that there is a specifically Linux way of
integrating an operating system, which is closely tied to the kernel, and
Debian increasingly finds it difficult to avoid getting pulled in 100% to that
path, since the amount of divergence you need to avoid doing things the
"systemd way" is growing rapidly. In which case the end game is that there's a
Linux/systemd core OS, of which Debian is just one distribution. While until
now Debian has aimed at being a "universal operating system" not tied to any
specific kernel.

~~~
laumars
FreeBSD is a distinct operating system.

I don't agree with your feelings towards Debian simply because I've used
plenty of platforms that have the same cohesive feel. But at the end of the
day all you're talking about now is an emotional impression based on anecdote,
which may feel relevant to yourself but ultimately has little bearing on the
discussion of distributions vs. distinct operating systems.

------
jemfinch
Reading the diff, I'm neither surprised that this bug existed nor will I be
surprised when similar bugs arise.

There must be a more readable way to write even low-level code like this.

~~~
koverstreet
Yes, there is. That code needs iterators.

~~~
yoklov
but most of that code isn't iterating. the only parts in that function i see
that _are_ iterating ([0] and the similar loop below it) seem straightforward
enough to me.

i don't see how they would help here unless i don't know what you mean by
iterators (assuming c++-style iterators or similar).

[0]:
[http://gitweb.dragonflybsd.org/dragonfly.git/blob/cf297f2ccc...](http://gitweb.dragonflybsd.org/dragonfly.git/blob/cf297f2cccd7fd102041ebf174af66a31a5ef5ce:/sys/kern/vfs_cluster.c#l1294)

------
remy_luisant
Couldn't they have dumped out the list of all individual write commands sent to
the disk and seen that the writes are in a suboptimal pattern? This would
have been one of the first things I would have done when testing a file system
of any kind: Make sure the thing writes what I want to be written and in the
way I want it to be written.
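
A minimal sketch of that kind of check, assuming the write log has already been reduced to a list of (offset, length) pairs. That trace format and the function names are hypothetical, not anything DragonFly actually emits. It reports how many writes start exactly where the previous one ended and what the log would look like if adjacent writes were merged:

```python
from typing import Iterable, List, Tuple

# Hypothetical trace record: (byte offset on disk, length in bytes).
Write = Tuple[int, int]


def coalesce(writes: Iterable[Write]) -> List[Write]:
    """Merge each write into the previous one when it starts where that one ended."""
    merged: List[Write] = []
    for offset, length in writes:
        if merged and merged[-1][0] + merged[-1][1] == offset:
            prev_offset, prev_length = merged[-1]
            merged[-1] = (prev_offset, prev_length + length)
        else:
            merged.append((offset, length))
    return merged


def sequential_fraction(writes: Iterable[Write]) -> float:
    """Fraction of writes that begin exactly at the end of the preceding write."""
    log = list(writes)
    if len(log) < 2:
        return 1.0
    hits = sum(
        1 for (o1, l1), (o2, _) in zip(log, log[1:]) if o1 + l1 == o2
    )
    return hits / (len(log) - 1)


if __name__ == "__main__":
    trace = [(0, 4096), (4096, 4096), (16384, 4096)]
    print(coalesce(trace))
    print(sequential_fraction(trace))
```

A log of many small writes that collapses to a handful of large ones after `coalesce` would suggest the I/O is being issued in smaller pieces than necessary.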

~~~
koverstreet
You don't even need to do anything that hard. All you need to do is run basic
benchmarks and compare the results to what other filesystems can do, and when
yours is about half as fast that should make you go "huh"...
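
Something as crude as the following would do as a first pass. It is only a sketch: the function name, block size, and total size are arbitrary assumptions, and a real comparison would need larger files, repeated runs, and the same hardware for each filesystem.

```python
import os
import tempfile
import time


def sequential_write_mb_per_s(directory: str, total_mb: int = 64, block_kb: int = 64) -> float:
    """Write total_mb of data sequentially into a temp file in `directory`
    and return the observed throughput in MB/s, including the final fsync."""
    block = os.urandom(block_kb * 1024)  # random data, so it does not compress away
    blocks = (total_mb * 1024) // block_kb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(blocks):
            view = memoryview(block)
            while view:  # os.write may write fewer bytes than requested
                written = os.write(fd, view)
                view = view[written:]
        os.fsync(fd)  # make sure the data actually reached the device
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)
    return total_mb / elapsed


if __name__ == "__main__":
    print(f"{sequential_write_mb_per_s(tempfile.gettempdir()):.1f} MB/s")
```

Run it against a directory on each filesystem you want to compare and look at the ratio rather than the absolute numbers.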

~~~
rurban
Hammer is twice as fast, not half. So huh

~~~
sleepychu
So it'll now be 4x?

~~~
rurban
We need benchmarks, and this is the old version only.

------
microcolonel
Always wondered where the bandwidth was going when I was using HAMMER. Had to
use UFS when testing last time.

------
HugoDaniel
How often does a sequential write happen on an SSD?

~~~
pmontra
As often as it happens with HDD, I think.

Anyway, with my SSD:

I unzipped a 300 MB Raspbian lite zip into a 1.3 GB image yesterday and
extracted parts from an 8 GB video with ffmpeg last weekend. I expect all of
them to be sequential given enough free space on the SSD (same for HDDs.)

I also work with docker and vagrant and building those images at half speed
can't be good.

------
Upvoter33
wow, this is an embarrassing bug. To say it doubles bandwidth makes it sound
like an optimization, when in fact it is just (finally) doing what you think
it should be doing. Yikes- how long was this doing this and no one noticed?

~~~
kzrdude
Data loss and corruption are always a lot more embarrassing. Did they have
tests that ensure correctness (in data) of the slow version? Then they were
already delivering on quality of the file system.

------
josteink
So if I'm reading this patch correctly it's now writing things in blocks
instead of as single bytes? Is that it?

And, well, yes. Of course that does wonders for throughput. And I would
certainly hope that someone who writes a file-system knows that.

Edit: Re-reading the patch it seems like what's being done here is taking a
(failed) attempt at writing in blocks and actually making it write in
proper block-sizes. Yay C, I guess.

~~~
TheDong
Wow, somehow you managed to read the patch without reading the commit message
that explains what actually changed.

No, you didn't read it correctly.

~~~
josteink
Well yes actually. The fault is completely on me.

But in my defence, the code and diff are all colourful and highlighted... While
the patch description is "hiding" very subtly in plain sight at the top.

When someone is new to any kind of system, their eyes will follow what
attracts their attention. I didn't even notice there was a commit message
there until you told me. Some minor UI tweaks could probably improve that page
greatly :)

~~~
kzrdude
It's amazing how good we are at filtering information at all levels. I agree
with you that the design of the page plays a large part. GitHub probably had
more time & reason to optimize this.

