

Comparing Filesystem Performance in Virtual Machines - zandi
http://mitchellh.com/comparing-filesystem-performance-in-virtual-machines

======
mrmondo
I'm not sure that there's much value in this article.

1) It doesn't really compare benchmarks on OSX vs Linux - you're dealing with
different drivers, kernels and filesystems.

2) Then there's the matter of EXT3 - there is little to no value in
benchmarking such an old filesystem on modern hardware, not to mention that
EXT3 lacks native TRIM support, which significantly impacts performance over
time (this is assuming the host OS is also using EXT3 and thus not passing the
discard ioctl through to the block device).

3) There's no mention of what kernel version was being run; modern kernels
(3.6+) are significantly more efficient with both disk and network I/O.

4) How much memory is being provided to the host and guest VMs and how much of
the 'benchmarks' are being cached? What kind of disk / filesystem caching are
they using? What is the IO scheduler for both the guest and the host machines?

5) What 'benchmarks' were actually run? I'd bet it involves using dd, which
is by no means a benchmark, nor can it be trusted (especially when we aren't
told what commands were even being run). If you're going to benchmark disks,
use fio or bonnie++ (although I think fio is more useful).

Here's a good article on how to perform a few kinds of useful benchmarks:
[https://www.binarylane.com.au/support/articles/1000055889-how-to-benchmark-disk-i-o](https://www.binarylane.com.au/support/articles/1000055889-how-to-benchmark-disk-i-o)
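
For reference, a minimal fio job file for a random-read test might look
something like this (the file path, size, and runtime here are illustrative
placeholders - adjust them for the filesystem you're actually testing):

```ini
; randread.fio - small random reads, bypassing the page cache
[global]
ioengine=libaio   ; Linux async I/O
direct=1          ; O_DIRECT, so results aren't just page-cache hits
runtime=60
time_based=1

[randread]
rw=randread
bs=4k
size=1g
filename=/tmp/fio-testfile   ; put this on the filesystem under test
iodepth=16
```

Run it with `fio randread.fio` and compare the reported IOPS and latency
percentiles between host and guest.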

Beyond that, I'd also look at using pgbench to get an idea of IOPS in
real-world scenarios:
[http://www.westnet.com/~gsmith/content/postgresql/pgbench-scaling.htm](http://www.westnet.com/~gsmith/content/postgresql/pgbench-scaling.htm)

------
geerlingguy
I have been benchmarking different usage scenarios for sharing files between
hosts and VMs for the past couple of years, and for the majority of use cases,
the only way to get near-native performance without much hassle is to use
Vagrant's rsync support instead of native shared folders or even NFS, which
exhibits enough jitter to be annoying at times.

See: [http://www.midwesternmac.com/blogs/jeff-geerling/nfs-rsync-and-shared-folder](http://www.midwesternmac.com/blogs/jeff-geerling/nfs-rsync-and-shared-folder)

If your project doesn't have hundreds or thousands of files, or if you don't
need the fastest filesystem performance on your host machine, other approaches
may be more convenient. But I stick with rsync/rsync-auto for 99% of my VMs,
whether using VirtualBox or VMware.
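
For context, switching a Vagrant project to rsync is a small change to the
Vagrantfile (the box name and exclude list below are just examples):

```ruby
Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/trusty64"   # example box

  # Use one-way rsync instead of VirtualBox/VMware shared folders or NFS.
  # Files land on the guest's native filesystem, so I/O is near-native.
  config.vm.synced_folder ".", "/vagrant",
    type: "rsync",
    rsync__exclude: [".git/"]
end
```

Running `vagrant rsync-auto` in a separate terminal watches the host folder
and re-syncs changes to the guest as you edit.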

------
notacoward
One point that might be hidden in the OP is that hypervisors are often
configured to lie about whether data has been committed to disk. This is
certainly true of most public clouds. As a distributed file system developer I
run a lot of I/O tests in the cloud, and I've gotten many results that are
impossible to explain any other way. People I know who work at some of these
providers have practically admitted it. In a cut-throat industry, any provider
who actually did the right thing here would get hammered on performance and
price (because they'd be unable to pack as many instances onto each host as
their competitors do). It's something to be aware of when you run any data-
intensive application in a public cloud, or when you're configuring hosts in a
private one.
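
One rough way to sanity-check this yourself (a sketch, not a rigorous test):
time a loop of small write+fsync cycles. On storage that actually honors
fsync, each cycle has to wait for the medium (or a battery-backed controller
cache); average latencies far below what the underlying device could plausibly
deliver suggest the sync is being absorbed by a volatile cache somewhere in
the stack.

```python
import os
import tempfile
import time

def fsync_latency(cycles=100, size=4096):
    """Average seconds per small write+fsync cycle."""
    payload = b"x" * size
    with tempfile.NamedTemporaryFile() as f:
        start = time.perf_counter()
        for _ in range(cycles):
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())   # ask for a real commit to stable storage
        elapsed = time.perf_counter() - start
    return elapsed / cycles

avg = fsync_latency()
print(f"avg write+fsync latency: {avg * 1000:.3f} ms")
```

Sub-0.1 ms averages on what is supposedly spinning-disk-backed cloud storage
would be one of the "impossible to explain any other way" results.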

~~~
falcolas
I don't know if it's possible to nail down a vendor for comment, but is there
any way to back this up? It would be good to know for cases like running an
ACID database.

------
bretkoppel
Note that these are from January, 2014. YMMV.

