

Beyond stunnel: High-speed, secure connections across the public Internet - jbrendel
http://blog.vcider.com/2012/02/beyond-stunnel-secure-high-speed-network-connections-in-the-public-cloud/
The standard ways to secure connections across public links (even for applications that don't support encryption themselves) have been to use stunnel or OpenVPN. But those solutions come with a significant performance hit. This article presents measurements and comparisons to illustrate this and presents a more modern solution with much better performance characteristics.
======
jbrendel
As you can see, after publishing this article I received some feedback and
questions about the comparison, including from the author of stunnel himself,
Michal Trojnara. Even though the stunnel website itself states that the
default is ‘no compression’, this apparently is not so. iperf’s default data
appears to be highly compressible, which heavily skewed the performance
numbers: stunnel was performing very different work than native networking or
vCider. To arrive at more realistic numbers, I used a large image transfer (a
JPEG) instead, which by its nature cannot be compressed much further. I
transferred this file with iperf (which can use file input) as well as wget.
The results? stunnel is much more comparable with both the native and the
vCider networking speeds.
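
To illustrate why the choice of test payload matters, here is a minimal Python sketch that compares how well DEFLATE compresses two payloads. Both payloads are assumptions made for illustration: a short repeating digit pattern stands in for synthetic benchmark data, and random bytes stand in for an already-compressed file such as a JPEG. Neither is taken from the actual test setup.

```python
import os
import zlib


def deflate_ratio(payload: bytes) -> float:
    """Return original size divided by DEFLATE-compressed size."""
    return len(payload) / len(zlib.compress(payload))


size = 1_000_000

# Assumed stand-in for synthetic benchmark data: a short repeating pattern.
repetitive = b"0123456789" * (size // 10)

# Assumed stand-in for already-compressed data (e.g. a JPEG): random bytes.
random_like = os.urandom(size)

print(f"repetitive payload:  {deflate_ratio(repetitive):8.1f}x")
print(f"random-like payload: {deflate_ratio(random_like):8.2f}x")
```

The repetitive payload shrinks by a very large factor, while the random-like payload does not shrink at all. A tunnel that compresses would therefore look far faster with the first kind of data than with the second.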

Interrupts and context switches are now roughly the same for all three
solutions. stunnel still exhibits a significantly higher CPU load (20%), but
certainly does not max out the CPU anymore. I suspect that the higher numbers
of context switches and interrupts result from iperf’s default behavior of
sending as much data as it can in a given time interval. And since stunnel can
easily compress iperf’s default data, iperf was able to send a lot of this,
which also explains the results reported by iperf.

While I maintain that a setup consisting of multiple nodes is much easier to
maintain with vCider – which also provides a number of other interesting
features – it must be noted that stunnel does indeed perform very well for
point-to-point connections. Note to self: Be sure not to use synthetic data
for performance tests like this.

------
strags
One question that isn't in the FAQ that I would instantly be concerned about
is exactly what encryption mechanism is used, and where the private keys are
stored. Specifically, are they stored in vCider's database?

If my goal was to set up a secure tunnel, I'd be incredibly wary of using a
closed source solution like this. Even if you're not worried about the
government secretly demanding access to your keys, you might reasonably be
concerned about vCider getting hacked.

~~~
jbrendel
Indeed, the keys are centrally created and (briefly) stored. However, they are
also changed frequently and historic (older) keys are not kept on record. So,
if someone intercepts a bunch of your traffic and then wants to hack or demand
our database, they would have to hurry since the older keys are not kept
around.

However, I understand your concern. We are thinking of ways to address this in
an even more comprehensive way.

------
mtrojnar
Testing results indicated that stunnel was much faster than the line speed
(and thus faster than his own product), but the author simply ignored it. In
fact his version of stunnel had DEFLATE/ZLIB compression enabled by default.
In order to prove that his own product is better than stunnel, the author of
this test decided to compare the bandwidth of an uncompressed stream encrypted
with his product with the bandwidth of a compressed stream encrypted with
stunnel.

~~~
jbrendel
Well, I contacted you privately to discuss this so that we may both enlighten
each other on what's going on, but you did not respond to my email and instead
went out publicly. Oh well.

But let's look at it. The physical link was capable of carrying something like
18 Mbit/s. When running iperf through stunnel, it reported more than 400
Mbit/s. But if you look at the actual bandwidth that was used during the
transfer, it was indeed just 4.8 Mbit/s.

So, the 400 Mbit/s is clearly an illusion caused by the compression.
Compressing a stream is a worthwhile goal, for sure. And it can be helpful in
many cases. But I guess iperf's data is highly compressible. Considering that
much of what's transferred these days is already compressed (multi-media
files), I doubt that in the real world you will see any sort of speedup even
remotely like this.
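
The figures quoted above can be turned into a quick worked calculation. This is only a sketch of the arithmetic: the 400 Mbit/s figure is the lower bound quoted ("more than 400"), and the helper `best_case_goodput` is a hypothetical name introduced here, resting on the assumption that the application-level rate is at most link capacity times compression ratio.

```python
# Figures quoted in the comment above.
reported_mbit_s = 400.0       # application-level rate reported by iperf
on_wire_mbit_s = 4.8          # bandwidth actually used on the link
link_capacity_mbit_s = 18.0   # approximate capacity of the physical link

# Implied compression ratio: application bytes per byte on the wire.
implied_ratio = reported_mbit_s / on_wire_mbit_s


def best_case_goodput(link_mbit_s: float, ratio: float) -> float:
    """Assumed upper bound on application-level rate for a given ratio."""
    return link_mbit_s * ratio


print(f"implied compression ratio: {implied_ratio:.0f}:1")
print(f"link utilisation: {on_wire_mbit_s / link_capacity_mbit_s:.0%}")
print(
    "best case for incompressible data: "
    f"{best_case_goodput(link_capacity_mbit_s, 1.0):.0f} Mbit/s"
)
```

The implied ratio is roughly 83:1. With data that does not compress (ratio 1.0), the same bound gives no more than the 18 Mbit/s the link can carry.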

The fact is this: When it comes to actually transferring data over the wire,
stunnel is very slow and there are no two ways about it. It attempts
compression at very high cost in CPU cycles and in the end is still going to
be bound by context switches and interrupts.

I'm going to repeat the tests with compression disabled and update the blog
post accordingly.

~~~
mtrojnar
1\. It's you who started the flame by publishing your obviously unfair
comparison.

2\. I didn't receive your email (if you really sent it).

3\. Since version 4.51, released over a month ago, stunnel has overridden the
OpenSSL default of enabling compression:
<http://www.stunnel.org/?page=sdf_ChangeLog>

4\. Whether compression is useful or not depends on many factors, including
not only type of data, but also available bandwidth and CPU power. And data
compression is not an illusion. I'd be afraid to use your products if you
don't understand it.

5\. Compression is indeed much slower than encryption. This is a fact. Do you
really mean that your product is better just because it doesn't support
compression?

6\. Stunnel is indeed a performance bottleneck, but only if your internet
connection is over 0.5Gbps, and your server is as slow as my desktop:
<http://www.stunnel.org/?page=perf>
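
Point 3 above turns on whether TLS-level compression is enabled by default. Rather than relying on any default, a benchmark can set it explicitly. The sketch below uses Python's `ssl` module, which wraps OpenSSL. It is not stunnel's configuration, only an illustration of the same OpenSSL option, and the variable names are chosen here for the example.

```python
import ssl

# Build a client-side TLS context; Python's ssl module wraps OpenSSL.
ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)

# Explicitly disable TLS-level compression, whatever the library default is.
ctx.options |= ssl.OP_NO_COMPRESSION

# Confirm the option is set on the context before using it for a test.
compression_disabled = bool(ctx.options & ssl.OP_NO_COMPRESSION)
print("TLS compression disabled:", compression_disabled)
```

On an established connection, `SSLSocket.compression()` reports the compression algorithm in use, or `None` if there is none, which is a simple way to check what a test run actually negotiated.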

~~~
jbrendel
1\. Sorry you thought a comparison in which your program doesn't come out on
top is a flame. It wasn't. Don't know what's unfair about comparing out of the
box, default install performance of two systems.

2\. Yes, I sent it, sorry you didn't receive it.

3\. I did a standard apt-get install for stunnel on the Ubuntu Oneiric system.

4\. Sigh. I'm not going to comment on that one.

5\. Really don't know where you would get this idea. Also don't understand why
you get so upset. As I said, I was more than happy to discuss this with you
ahead of time. Let me try to explain this again: I suggested that most real-
world data is already compressed, thus the benefits you could derive by doing
compression on the wire are lessened to the point where compression won't get
you anything. For any compressible data, compressing it before sending is
great and beneficial and I never stated otherwise.

~~~
mtrojnar
Did you really want to discuss your results "ahead of time"? You could easily
do it before publishing them. My email address is in the manual of stunnel.

I could argue whether an old Ubuntu package is "out of the box" stunnel, or
whether sending a 20 Mbit/s stream of compressed video is really the most common
use of stunnel...

------
wmf
As usual, someone's pretending that IPSec doesn't exist.

~~~
jbrendel
I'm not pretending this. But consider that IPSec doesn't exactly have a
reputation for being easy to set up, or even for being particularly well
supported across IaaS provider networks.

People resort to stunnel (and OpenVPN) exactly because they may not have the
knowledge or inclination to get a fully featured IPSec setup going.

------
pedrocr
Are the two doing the same amount of encryption or is vCider using a much less
complex cipher? Is the CPU difference really just in the kernel vs userspace
implementation?

tinc (<http://tinc-vpn.org/>) would seem like a more interesting comparable
than stunnel since it sets up a p2p VPN that routes all IP traffic instead of
just a point-to-point link.

~~~
jbrendel
vCider uses AES 256 encryption.

Tinc looks interesting, I will test that as well.

Considering the huge difference in context switches and interrupts, I don't
think that encryption-induced CPU load is the only issue here.

Kernel stuff on its own doesn't just magically run faster, of course. But in
this case, I think it's the constant interaction between user-space and
kernel-space that causes the problem. That's an issue that will impact any
user-space solution to a networking problem.

~~~
pedrocr
And what encryption is stunnel using? AES was picked to be fast so it may very
well be outperforming whatever stunnel uses.

Can 1200 extra context switches and ~6400 extra interrupts per second use up
100% CPU? In fact, you mention that for stunnel most of the CPU time was in
userspace and not the kernel, which would indicate time spent actually using
the CPU instead of doing context-switches and interrupts, which I assume top
would show as "sys".

I also find the extra interrupts strange. I wonder if vCider is sending bigger
packets and what caching/latency implications that might have.
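
A rough back-of-envelope check of the question above can be sketched as follows. The event counts come from this comment. The per-event costs are purely hypothetical assumptions, not measurements from the test, and they ignore indirect effects such as cache pollution.

```python
# Figures from the comment above.
extra_context_switches_per_s = 1200
extra_interrupts_per_s = 6400

# ASSUMPTIONS (not measured in the thread): rough direct per-event CPU costs.
assumed_cost_per_context_switch_s = 5e-6  # 5 microseconds, hypothetical
assumed_cost_per_interrupt_s = 2e-6       # 2 microseconds, hypothetical


def cpu_fraction(events_per_s: float, cost_s: float) -> float:
    """Fraction of one core consumed by events with a given per-event cost."""
    return events_per_s * cost_s


overhead = (
    cpu_fraction(extra_context_switches_per_s, assumed_cost_per_context_switch_s)
    + cpu_fraction(extra_interrupts_per_s, assumed_cost_per_interrupt_s)
)
print(f"estimated extra CPU: {overhead:.1%} of one core")
```

Under these assumed costs the extra events account for about 1.9% of one core. Even if the real per-event costs were several times higher, that would still be far from 100%.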

~~~
aidenn0
Also, I had a bitch of a time getting performance out of stunnel; it seemed to
ignore hardware acceleration units that OpenSSL was happy to use.

------
waldo2k2
What version of OpenSSL was used on the test machine? That would greatly
affect stunnel's performance. See here:
<http://vincent.bernat.im/en/blog/2011-ssl-benchmark-round2.html>

~~~
jbrendel
Version 1.0.0.e

