
Introducing iSCSI over TLS - srust
https://www.blockbridge.com/introducing-iscsi-over-tls/
======
pjc50
IPsec has always been tremendously hard to set up and use. The use of TLS
seems like a sensible modern choice.

However, I've never been clear on when someone would use iSCSI other than
between nodes in the same cabinet. You can't share storage at the block layer,
so the two systems are necessarily tied together exclusively. For WAN use some
sort of S3-like blob store would always seem like a better choice?

~~~
fulafel
Shared block storage filesystems exist:
[https://en.wikipedia.org/wiki/Clustered_file_system#Shared-d...](https://en.wikipedia.org/wiki/Clustered_file_system#Shared-disk_file_system)

You wouldn't necessarily use this for shared storage though. One might argue
that the standard SAN setup, where you count on the network being secure, is
outdated now, same as the "secure internal network" mindset in general.

~~~
mango77
Sometimes when you think you have a "secure internal network", this happens:
[https://www.washingtonpost.com/business/technology/google-en...](https://www.washingtonpost.com/business/technology/google-encrypts-data-amid-backlash-against-nsa-spying/2013/09/06/9acc3c20-1722-11e3-a2ec-b47e45e6f8ef_story.html)

------
__michaelg
This article seems a bit fishy to me. E.g.:

> In almost all network stacks, per-flow IPsec processing is serialized due to
> implementation specific constraints of upper level protocols that require
> in-order packet processing and lower level hardware interfaces that are
> optimized for latency. The lack of concurrency, combined with memory copy
> requirements, inline cryptographic operations and multiple traversals of the
> network layer are to blame for poor performance.

How is that different for the TLS solution? You still need somewhat in-order
processing of the TLS stream, memory copies won't really go away with an
additional TLS proxy, and the network layer traversals also aren't really that
different. Sure one can put work into optimizing all of that, but that appears
orthogonal to using TLS to me.

What follows is the usual collection of IPsec management downsides -- which
may be true but shouldn't affect performance.

IMO the interesting part is pretty far down the post:

> Additionally, TLS record sizes are often larger than MTU (16KB vs
> 1500/9000B), resulting in lower setup costs for encapsulation.

This may be the actual important change that affects the throughput. It would
be interesting to see the effect on real workloads, i.e. not in synthetic
benchmarks. Vice versa, it would also be interesting to see the same
benchmarks with TLS record sizes limited to the usual MTU sizes.
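
To put rough numbers on the record-size point, here is a minimal sketch that
counts how many units have to be encapsulated to move a fixed amount of data.
It assumes every unit is filled completely and ignores header, tag, and
padding overhead; the 1 MiB transfer size and the `units_needed` helper are
illustrative and not taken from the article.

```python
# Maximum TLS record plaintext size (2^14 bytes), per the TLS specification.
TLS_MAX_RECORD = 16 * 1024


def units_needed(transfer_bytes: int, payload_per_unit: int) -> int:
    """Units (packets or records) needed to carry transfer_bytes.

    Simplification: each unit carries payload_per_unit bytes of data,
    with no allowance for protocol headers or authentication tags.
    """
    # Integer ceiling division, avoids floating point rounding.
    return -(-transfer_bytes // payload_per_unit)


transfer = 1024 * 1024  # 1 MiB, an arbitrary illustrative transfer size

for label, size in [
    ("per-packet, 1500 B MTU", 1500),
    ("per-packet, 9000 B MTU", 9000),
    ("per-record, 16 KiB TLS record", TLS_MAX_RECORD),
]:
    print(f"{label}: {units_needed(transfer, size)} units")
```

Under these assumptions, 1 MiB needs 700 units at a 1500 B MTU, 117 at
9000 B, and 64 at 16 KiB. Whether the per-unit setup cost really dominates is
what a benchmark with MTU-limited records would show.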

As the saying goes: I'm not impressed if your prototype is slightly better
than what we have had in production for years.

~~~
bbjnicklin
Michaelg,

Regarding the network, the paths are different in a meaningful way. You might
want to trace the paths for yourself to see how:
[https://upload.wikimedia.org/wikipedia/commons/3/37/Netfilte...](https://upload.wikimedia.org/wikipedia/commons/3/37/Netfilter-packet-flow.svg).
Specifically, with IPsec, you make another pass through the network layer
after hitting XFRM in the protocol layer. This is not the case for a normal
flow. This “loop” is significant because of the frequency with which it
happens. Also, as the post states, random memory accesses are the killer. This
additional loop and the logic within it exacerbate the problem.

Also, if you’ve been using IPsec in production for multiple years, it’s
possible that you are using non-AESNI optimized ciphers, so the speedup could
in fact be MUCH greater. If you can share a bit about your specific
deployment, we would be happy to provide guidance.

------
theonewolf
Are you really the first to use TLS in securing the storage fabric?

I believe Seagate's Kinetic had TLS features quite a while ago:
[http://www.seagate.com/tech-insights/kinetic-vision-how-seag...](http://www.seagate.com/tech-insights/kinetic-vision-how-seagate-new-developer-tools-meets-the-needs-of-cloud-storage-platforms-master-ti/)

Both are reminiscent of CMU's NASD research:
[https://en.wikipedia.org/wiki/Network-Attached_Secure_Disks](https://en.wikipedia.org/wiki/Network-Attached_Secure_Disks)

New things that _are really cool_ from Blockbridge:

1\. Opening up WAN access to storage devices via TLS

2\. Kernel drivers (for all major OS's?) allowing storage devices over the WAN

3\. Re-imagining how we access block devices

Opens up the possibility for EBS-like storage across the WAN which is, simply,
amazing.

~~~
bbjnicklin
theonewolf, to the best of our knowledge, we are the first to offer
TLS-protected iSCSI (a block storage transport). As we said in the post, SSL
was considered back when the iSCSI protocol was developed, but IPsec won out.
Many people use TLS today for object storage (e.g. S3, Kinetic). That said,
this post is less of a “pitch” and more of a “here’s what's possible with
modern technology”. Regarding NASD, if I recall correctly, that was a
file-based research project.

BTW, regarding the drivers: none needed, you can do it from userland on any
platform with a simple SSL proxy like stunnel. If you want, reach out and we
can hook you up with software to play around with.
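
To make the userland proxy idea concrete, here is a minimal Python sketch of
what a client-side TLS wrapper does: it accepts plaintext connections on
localhost and relays the bytes over a TLS connection to the remote end. It is
not stunnel and not Blockbridge's software. The function names, the remote
hostname, and the remote TLS port are all hypothetical; only 3260 as the
standard iSCSI port is a known value.

```python
import asyncio
import ssl


async def pipe(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
    """Copy bytes from reader to writer until the reader hits EOF."""
    try:
        while data := await reader.read(65536):
            writer.write(data)
            await writer.drain()
    except ConnectionError:
        pass  # treat a reset like a normal close
    finally:
        # Closing tears down that whole connection (no half-close).
        writer.close()


def make_handler(remote_host: str, remote_port: int, ssl_ctx):
    """Build a per-connection callback that relays to remote_host:remote_port.

    ssl_ctx is an ssl.SSLContext; passing None would relay in plaintext.
    """
    async def handle(local_reader, local_writer):
        remote_reader, remote_writer = await asyncio.open_connection(
            remote_host, remote_port, ssl=ssl_ctx
        )
        # Relay both directions concurrently.
        await asyncio.gather(
            pipe(local_reader, remote_writer),
            pipe(remote_reader, local_writer),
        )
    return handle


async def main(listen_port: int, remote_host: str, remote_port: int) -> None:
    # Default context: verifies the server certificate and hostname.
    ctx = ssl.create_default_context()
    server = await asyncio.start_server(
        make_handler(remote_host, remote_port, ctx), "127.0.0.1", listen_port
    )
    async with server:
        await server.serve_forever()
```

Assuming a target that terminates TLS at `target.example.com:3261` (both made
up for illustration), you would run
`asyncio.run(main(3260, "target.example.com", 3261))` and point the iSCSI
initiator at `127.0.0.1:3260`. A real deployment would more likely use
stunnel itself, which handles reconnects, logging, and certificate options.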

------
glastra
The latency comparison seems a bit crappy to me. Comparing the lowest points
in two series of data is very error prone.

~~~
bbjnicklin
Glastra. Maybe this data is a bit better for you, direct from fio.

    
    
      TLSv1.2: ecdhe-rsa-aes128-gcm-sha256
      
      QD1: (groupid=0, jobs=1): err= 0: pid=2140: Sat Mar  5 14:59:49 2016
        read : io=2308.5MB, bw=78791KB/s, iops=19697, runt= 30001msec
          slat (usec): min=2, max=21, avg= 2.42, stdev= 0.50
          clat (usec): min=39, max=56179, avg=47.66, stdev=126.45
           lat (usec): min=46, max=56182, avg=50.14, stdev=126.45
          clat percentiles (usec):
           |  1.00th=[   46],  5.00th=[   46], 10.00th=[   46], 20.00th=[   47],
           | 30.00th=[   47], 40.00th=[   47], 50.00th=[   47], 60.00th=[   47],
           | 70.00th=[   48], 80.00th=[   48], 90.00th=[   49], 95.00th=[   49],
           | 99.00th=[   51], 99.50th=[   51], 99.90th=[   53], 99.95th=[   56],
      
      
      IPsec: aes128-gcm96
      
      QD1: (groupid=0, jobs=1): err= 0: pid=2442: Sat Mar  5 15:27:04 2016
        read : io=1902.6MB, bw=64938KB/s, iops=16234, runt= 30001msec
          slat (usec): min=2, max=29, avg= 2.39, stdev= 0.52
          clat (usec): min=52, max=57367, avg=58.51, stdev=140.69
           lat (usec): min=57, max=57369, avg=60.97, stdev=140.69
          clat percentiles (usec):
           |  1.00th=[   56],  5.00th=[   56], 10.00th=[   57], 20.00th=[   57],
           | 30.00th=[   57], 40.00th=[   57], 50.00th=[   58], 60.00th=[   58],
           | 70.00th=[   58], 80.00th=[   59], 90.00th=[   60], 95.00th=[   61],
           | 99.00th=[   63], 99.50th=[   65], 99.90th=[   78], 99.95th=[   82],
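
For anyone who prefers ratios to raw fio output, here is a small sketch that
works from the averages above. The numbers are copied from the two runs; the
`pct_change` helper is just for illustration. At queue depth 1 only one I/O is
in flight, so IOPS should sit close to 1 / average latency, which gives a
quick consistency check on the data.

```python
# Averages copied from the two fio QD1 read runs above.
tls = {"iops": 19697, "lat_avg_us": 50.14}
ipsec = {"iops": 16234, "lat_avg_us": 60.97}


def pct_change(new: float, old: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


iops_gain = pct_change(tls["iops"], ipsec["iops"])
latency_change = pct_change(tls["lat_avg_us"], ipsec["lat_avg_us"])
print(f"IOPS, TLS vs IPsec: {iops_gain:+.1f}%")
print(f"Average latency, TLS vs IPsec: {latency_change:+.1f}%")

# With a single I/O in flight, IOPS is bounded by 1e6 / average latency (usec).
for name, run in (("TLS", tls), ("IPsec", ipsec)):
    bound = 1_000_000 / run["lat_avg_us"]
    print(f"{name}: reported {run['iops']} IOPS, latency bound {bound:.0f}")
```

That works out to roughly 21% more IOPS and about 18% lower average latency
for TLS in these two runs, with the reported IOPS within a couple of percent
of the latency bound in both cases.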

~~~
feld
Latency is especially important for small I/Os, but there are a ton of things
that can be done to improve latency in general at the OS and network level.

~~~
bbjnicklin
Feld, you are totally right. In fact, there has been discussion in the Linux
community about improving IPsec performance. However, as the post states, the
difficulty in doing so is the API contract in the kernel. That said, even with
a perfectly optimized IPsec stack, having to process each packet individually
will always be slower than processing a logical record that is a multiple of
packet size.

------
julie1
Oh my. Coupling the instability of stateful connections under congestion with
all the problems of atomicity and non-corruption of data at the physical
layer? It smells like a predictable catastrophe for the future. Not only in
terms of a critical path for correctness that is narrower and more sensitive
to the noise of the internet, but also in terms of costs. Why do people accept
trading off stability and security for ... what?!

Well. Yes a robot could be juggling efficiently while dancing on a ball. And
it would be a nice technical success. And IPSEC/TLS are a ball.

But, why? Can anyone show me a use case where the benefits are worth the
costs, with an actual business explanation?

Not just "it is there, so we are gonna use it, lolz". Progress is anything
that changes, so deal with it, we are gonna make a regression and call it
progress. :DDDDD

Did all high-tech people forget about money, budget, resources, and
contracts?

Bro-tech is tiring.

If you want to synchronize filers for your recovery plan, you don't have to
separate them by 1000 km. You can lay fiber and use layer 2 routing with
physically protected equipment (level 0 security). If you can stop your
activity for 8 hours, you can even have your HD backup delivered by plane to
another country.

Ciphering costs money.

