

Weave – The Docker Network - ferrantim
https://github.com/zettio/weave/

======
jpgvm
I hope all of these Docker overlay networks start using the in-kernel overlay
network technologies soon. User-space promiscuous capture is obscenely slow.

Take a look at GRE and/or VXLAN and the kernels multiple routing table
support. (This is precisely why network namespaces are so badass btw). Feel
free to ping me if you are working on one of these and want some pointers on
how to go about integrating more deeply with the kernel.

It's worth mentioning these protocols also have reasonable hardware offload
support, unlike custom protocols implemented on UDP/TCP.
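For context, the kernel-side approach looks roughly like this with iproute2: a minimal sketch of creating a kernel-managed VXLAN device (the interface names, VNI, addresses, and peer are illustrative, and the commands need root):

```shell
# Create a VXLAN device with VNI 42 that encapsulates entirely in-kernel,
# on the standard IANA VXLAN UDP port, using eth0 as the underlay.
ip link add vxlan42 type vxlan id 42 dstport 4789 dev eth0
ip addr add 10.2.0.1/24 dev vxlan42
ip link set vxlan42 up

# Static flood entry pointing at a remote peer's underlay address
# (a multicast group or a control plane can populate the FDB instead).
bridge fdb append 00:00:00:00:00:00 dev vxlan42 dst 192.0.2.10
```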

~~~
shykes
I would love to take you up on that. I want to bake vxlan support into Docker
upstream (as an optional plugin, like everything else).

Edit: Hi Joseph! Just realized it was you :)

~~~
kijiki
If you're going down the path of VXLAN support in Docker, I'd love to talk.
The company I founded built a Linux distribution for commodity hardware
switches that can do VXLAN encap/decap in hardware at 2+ Tbit/sec. The same
configuration that works in a Linux container host or a hypervisor works on
the switches.

nolan@cumulusnetworks.com

------
t0mas88
This looks like a great idea. For me this was a missing piece two months ago
when playing with Docker.

However I have strong doubts about the network performance, not only the
overhead of the UDP encapsulation (that should be quite small), but mostly the
capturing of packets with pcap and then handling them in user-mode. Looks like
a lot of context-switches, copying and parsing with non-optimal code paths.
Are there any benchmarks available?
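The per-byte cost of the encapsulation itself is indeed small, as a back-of-envelope calculation shows (standard header sizes; any extra tunnel framing bytes are ignored here):

```shell
# Outer headers added by a UDP tunnel: Ethernet (14) + IP (20) + UDP (8)
# = 42 bytes wrapped around a 1514-byte inner frame (1500-byte MTU payload
# plus its own Ethernet header).
overhead=$(awk 'BEGIN { printf "%.3f", 42 / (1514 + 42) }')
echo "per-packet byte overhead: $overhead"   # about 2.7% for full-size frames
```

The real cost, as the comment says, is not these bytes but the per-packet context switches and copies of user-space capture.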

My feeling is that this will consume large amounts of CPU for moderate network
loads and thus be unusable with most NoSQL kind of systems that benefit from
clustering across hosts?

~~~
weavenetwork
re benchmarks...publishing some is on the TODO list. See
[https://github.com/zettio/weave/issues/37](https://github.com/zettio/weave/issues/37).
Informally, weave is pretty fast but it's not saturating Gbit Ethernet. As you
say, capturing with pcap and handling packets in user space carries an
appreciable overhead.

We've got some issues filed to look at pcap alternatives and also generally
aim to improve performance.

re suitability for NoSQL clustering... depends on where the bottlenecks are;
if you want to cluster for HA rather than scale, i.e. there aren't any real
bottlenecks, then weave will work well. Same if you want to cluster because of
CPU or memory bottlenecks. If, otoh, networking is the bottleneck then adding
weave into the mix isn't going to improve matters.

~~~
t0mas88
Ok, good to know. I think the challenge in taking another route than pcap is
that you would need to do complex tricks with the existing network stack.
Because if I understand how Weave works, you would really only need to do
processing at the beginning of a connection and for some ARP requests etc.,
while you don't need to do anything to existing TCP streams apart from
encapsulating and forwarding?

~~~
weavenetwork
> complex tricks with the existing network stack

To retain the essence of how weave operates, this would likely not just be
complex but impossible, short of kernel hackery.

> you would really only need to do processing at the beginning of a connection
> and for some ARP request

Weave needs to look at every Ethernet packet. Well, the headers at least. It's
a virtual Ethernet switch. It doesn't even really know about IP, let alone TCP
streams. See [https://github.com/zettio/weave#how-does-it-
work](https://github.com/zettio/weave#how-does-it-work)

------
thu
This seems very nice. What would be the pros and cons of using Weave instead
of Tinc? I have used Tinc for a while[0] and the end result looks very
similar (i.e. there is no nice command-line tool dedicated to using Tinc with
Docker, but the high-level descriptions match).

[0]:
[https://gist.github.com/noteed/11031504](https://gist.github.com/noteed/11031504)

~~~
weavenetwork
Just having a look at Tinc... one difference is completely non-technical - Tinc
is GPL but Weave is Apache-licensed, so it aligns better with the whole Docker
ecosystem. More comments to come.

------
ferrantim
These are the same people who built RabbitMQ.

~~~
hintjens
RabbitMQ has always been an extraordinarily good piece of software, from the
very first versions. I think this is very good news for anyone using Docker.

------
netcraft
as someone who is more developer than ops, I feel like the docker stuff is
still changing fast and that the way you would use docker today will be very
different a year from now; but containers seem to be the way of the future.
If I have no pressing need to change my server architecture, does it make
sense to wait for things to settle, or would it be more beneficial to get in
and learn now and experience the changes and why they were necessary?

~~~
ipedrazas
precisely because you're a developer you should embrace docker with open arms.

Because docker removes all the trouble of running applications that you need
for your development: databases, application servers, queues...

I love the fact that I can focus on my code and not on all those details that
stole so much of my time.

~~~
pas
I'd like to see a post/writeup (or even an essay!) on how to do the magic
"backing services" ([http://12factor.net/backing-
services](http://12factor.net/backing-services)) with Docker. And that's where
I find Deis and other Docker orchestration systems lacking very much.

Sure, you can run MySQL in Docker, but it's a far cry from running it on
native xfs with aligned partitions and whatever fancy configuration you feel
like. And since docker containers are very reusable, whereas backing data
should by default be persistent, my impression is that it's too easy to
accidentally remove a docker container.

~~~
vidarh
Nothing is stopping you from running MySQL in Docker on native xfs with
aligned partitions: bind-mount whatever partition you want into the container
by defining a volume in Docker.

This _will_ be persistent, and _will_ survive when you destroy the container.
I use this to e.g. share a /home directory between a dozen experimental dev
containers I use to run my various projects - each container ensures I keep
track of the exact dependencies for each individual project, while I get to
have a nice "comfortable" swiss-army-knife container with my dev tools and all
project files.

I _also_ run a number of database containers which use volumes where I bind
mount host directories to ensure persistence so I can wipe and rebuild the
containers themselves without worrying about touching data.
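As a concrete sketch of that bind-mount pattern (the host path and image name are illustrative):

```shell
# Bind-mount a host directory (e.g. on an xfs partition) onto MySQL's
# data directory inside the container.
docker run -d --name db -v /srv/xfs/mysql:/var/lib/mysql mysql

# The data outlives the container: destroy and recreate it at will,
# and /srv/xfs/mysql on the host is untouched.
docker rm -f db
docker run -d --name db -v /srv/xfs/mysql:/var/lib/mysql mysql
```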

~~~
wastedhours
Am intrigued (and, again, showing my current early-stage understanding of
LXCs), can you link the same data store container to multiple application
containers? As in, have both a beta application and production application
pulling data from the same core DB?

And do you simply define the container as a volume to ensure it stays
persistent? That was the feeling I got from the docs, but again, might just be
flagging how little I know at the minute...

~~~
TheDong
docker run --name=my-data -v /host/data:/container/data data-container

docker run --volumes-from=my-data app-beta-container

docker run --volumes-from=my-data app-prod-container

That would share the data store, however the real way you'd do this would
be....

docker run --name=my-data -v /host/data:/container/data data-image

docker run --volumes-from my-data --name my-database database-image

docker run --link=my-database beta-app

docker run --link=my-database prod-app

Doing --link will allow those two containers to network-communicate and you
should only be communicating with your database over the network anyways.

------
grkvlt
This is really interesting. I've been looking for a way to build in support
for networking between Docker hosts in my clocker.io software, to simplify
deploying applications into a cloud-hosted Docker environment. I'd been toying
with adding Open vSwitch, but am going to try weave as the network layer in
the next release. Will there be any problems running in a cloud where I have
limited control over the configuration of the host network interfaces and the
traffic they can carry, such as AWS only allowing TCP and UDP between VMs?

~~~
weavenetwork
TCP and UDP is all you need. (And the UDP can actually be quite broken, just
not completely)

We've created weave networks spanning hosts on EC2, GCE and local data
centres.

------
greenimpala
Err can anyone spot the tests in the repo? I cannot.

------
zobzu
What this really means security wise:

[http://i.imgur.com/Cko02do.png](http://i.imgur.com/Cko02do.png)

~~~
cschneid
Each `W` node in that graphic is a different physical host. So 3 kernels.

------
saryant
Question for weavenetwork: are containers addressable by hostname from other
containers? Is there a good way to do that? I didn't see anything about it in
the readme.

I suppose service discovery is out-of-scope for this project but having some
sort of weave-wide hostsfile would certainly simplify it. Am I
misunderstanding the project?

~~~
weavenetwork
Weave itself does not provide addressability beyond IP. That is the situation
now, but this area is very much high on the agenda for us - service discovery
is definitely _in scope_ for weave.

Meanwhile, two points of note:

1) In weave the IP addresses can be much "stickier" than in other network
setups, i.e. moving a container from one host to another can retain the
container's IP. That means it is quite amenable to relatively static name
resolution configurations, e.g. via /etc/hosts files.

2) Since weave creates a fully-fledged L2 Ethernet network between app
containers, name resolution technologies like mDNS that rely on multicast
should work just fine.
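Concretely, the sticky-IP point means a static hosts file distributed to the containers could serve as a crude resolver; a made-up example (names and addresses are purely illustrative):

```
# /etc/hosts fragment shared across app containers
10.2.1.10  db1.weave.local
10.2.1.11  db2.weave.local
10.2.1.20  app1.weave.local
```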

So, in summary, while weave currently does not have any built-in service
discovery, existing solutions and technologies for that should be relatively
easy to deploy inside weave application networks, until weave itself grows
these capabilities.

~~~
SEJeff
Perhaps you'd consider looking for the best way to "weave" weave into consul?

[http://consul.io](http://consul.io)

~~~
weavenetwork
We are certainly aware of consul, and have indeed been thinking of weaving
weave into it. Would love to see an experiment along those lines, if there are
any volunteers.

------
brazzledazzle
Has anyone compared this to rudder ([https://coreos.com/blog/introducing-
rudder/](https://coreos.com/blog/introducing-rudder/))?

~~~
weavenetwork
The most significant conceptual difference is that in rudder sub-nets are tied
to hosts. So containers on different hosts will always be on different sub-
nets. By contrast, in weave containers belonging to the same application
reside in the same sub-net, regardless of what host they are running on. In
other words, weave makes the network topology fit the application topology,
not the other way round.

------
GrantNelson
Yikes, this looks scary. Just because you can do something doesn't mean you
should. Networks are finicky, perf is king.

------
baq
can you compare this with openvpn, or any other vpn if we're at it?

