The Linux kernel MultiPath TCP project (multipath-tcp.org)
124 points by TallGuyShort on Apr 19, 2013 | 69 comments

Multi-path TCP is not only about throughput, but also about providing redundancy to improve stability. To me, the redundant streams are the more attractive part.

It happens a lot when I am with my iPhone at the boundary of Wi-Fi coverage: the connection is very unstable, oscillating between Wi-Fi and 4G. The phone stays associated with the Wi-Fi network even though it isn't actually working. With multi-path TCP, mobile devices could establish simultaneous streams over both Wi-Fi and 4G. When Wi-Fi becomes unresponsive, the data stream could be shifted onto 4G seamlessly, without adding latency at the application layer.

Another protocol that may achieve this is SCTP[1], and it's been around for years[2]. But migrating existing services to another transport protocol is a challenge, which makes SCTP difficult to adopt. I'm curious how Multipath TCP handles this. Is it an extension to TCP that stays backward compatible with TCP, or yet another new transport protocol?

[1]: http://en.wikipedia.org/wiki/Stream_Control_Transmission_Pro...

[2]: http://www.sctp.org/implementations.html


EDIT: More details about its design decisions and goals: http://multipath-tcp.org/data/MultipathTCP-netsys.pdf

The tech talk (http://www.youtube.com/watch?v=VWN0ctPi5cw) explains how MPTCP can be implemented on top of existing TCP without interrupting the client, the server, or both.

With two antennas (antennae?) simultaneously broadcasting/receiving, I think battery life would be pretty low...

Not necessarily antennas. Antennas belong to the physical layer, not the transport or network layer. 802.11n can use multiple antennas, but if you are at the boundary of the coverage, multiple antennas won't help much. Multi-path TCP is more about multiple transport streams. Of course it requires separate antennas for Wi-Fi and 4G, but that's already the case on current mobile devices, so nothing changes in this sense. It won't affect battery life much in terms of antenna power consumption, because you can set 4G to be a backup-only link, and MPTCP won't offload to 4G until Wi-Fi becomes unresponsive. If you are using full MPTCP (both links all the time), then that's the tradeoff between performance and power consumption.

In my experience a data link takes far more power than wifi, so the extra use won't be very notable.

I think that the opposite is true (but depends on the type of connection). Mobile stations are focused on low power, they were designed that way from the beginning (e.g. they constantly try to reduce the transmit power if you have "too good" reception so it will be just good and use a lot less power) unlike wifi.

I remember that the first mobile prototypes used around Motorola were dying of battery after a few hours if they used wifi instead of mobile data (that was about 8 years ago, so we are probably talking GPRS here).

That's not exactly how this works; you can just maintain simultaneous connections over both network methods. This consumes a very small amount of data on both, but will intelligently shift the bulk of data between the two as connectivity/packet loss/etc changes. It's about being able to seamlessly go from one to the other and back again without having to halt and re-open TCP connections every time, not about using both connections full-out all the time.

Check out the tech talk if you want to know more.

Bugs have antennae, radios have antennas (or aerials).

Since the site is having trouble:

MultiPath TCP (MPTCP) is an effort towards enabling the simultaneous use of several IP-addresses/interfaces by a modification of TCP that presents a regular TCP interface to applications, while in fact spreading data across several subflows. Benefits of this include better resource utilization, better throughput and smoother reaction to failures. Slides - explaining MultiPath TCP - are available in .pdf and .pptx format. You can also have a look at our Google Techtalk about MPTCP. The IP Networking Lab is implementing MPTCP in the Linux Kernel and hosting it on this website for users, testers and developers. For questions, feedback,... please contact us at the mptcp-dev Mailing-List



Maybe the site should be using MultiPath TCP?

It is using MPTCP. We are just seeing an extremely high visitor count right now. The server is only running on a tiny Xen VM, and thus it becomes overloaded.

Has MPTCP been tested with CoDel?

If you use Multipath TCP with a single interface through a router, it will behave like regular TCP (using the NewReno congestion control scheme).

Seems like this is extremely wireless carrier friendly. Once you're able to do transparent tcp handoffs your ability to exploit unlicensed spectrum to offload traffic from your saturated licensed bands without a very obvious service interruption gets much easier. If you deploy devices set to auto associate with a whitelisted set of provider and partner hotspots as soon as the connection is up you can offload the bulk of it, letting you serve more customers with the same amount of spectrum. Addressing heavy demand locations with wifi is way cheaper than minicells or DAS systems and wouldn't involve any regulatory approval or backhaul requirements.

I had to laugh when I saw the charts of LTE+wifi beating wifi alone in throughput. Even if the phone OS was set up to try to exploit that, any wireless provider able to tie their own shoelaces is going to monitor multipath traffic like that, and once they infer your alternate link is reasonably fast they'll start intentionally congesting their side of the path to force the bulk onto the local route.

I wonder what sort of additional profiling this could enable. Right now my phone is on 3g and my dual homed wifi network. It seems possible that being able to see my traffic from wireless, wireline ip4 and via my ip6 tunnel provider at the same time along with each path's latency would leak more data than existing hard handoffs.

It will probably be usable from the perspective of a mobile carrier, but really, the hard part is the 'transparent tcp handoff' part. They already know all too well how nice it is to offload traffic to wifi. Two problems exist: most carriers don't have the servers or the wifi nets to support it nicely, and mobile phones don't support automatic and transparent handover in a nice way (iOS 5 brought support though, so it is not all bad. Android support is mostly lacking though).

Also, wifi already tends to be so much better that the bulk of the traffic would already go that way if possible without carrier interference.

This is the main feature of the design: transparent handoff smoother than anything today, without any existing tcp connections dropping and with faster throughput recovery. They demo skype handing off from 3g to wifi, keeping the call alive and only suffering ~2 seconds of delay during transition.

The project has a number of working kernels for various android phones available.

Carriers would widely deploy ISM nets or whitespace if the handoff was painless and your streaming video worst case lagged out a couple of seconds.

An important difference between using Multipath TCP versus other techniques for 3G/WiFi offload is that Multipath TCP would be a completely end-to-end solution. The handover is entirely controlled by the smartphone and does not require any cooperation from the network. Existing 3G/WiFi offload techniques assume that the network operator controls both the WiFi and the 3G network. This is sometimes the case, but even if your 3G provider has several WiFi access points, there are probably many more access points that are not controlled by the 3G operator and that Multipath TCP would enable you to use.

There are levels of transparency. The level you are talking about still requires the wifi to either be entirely open or to have the authentication details stored in the phone manually. The level I'm talking about is where your phone automatically authenticates using the SIM card when you happen to go near an access point, and then manages to do a handoff with no manual steps involved.

I'm not saying multipath TCP won't be very nice for mobiles, but it will mostly be home and work networks that will be used, just like how it is for wifi use in the mobiles today.

As for 2 second handoff being smoother than anything else out there today, that is just not true. The situation for phones doing handoff with EAP-SIM/AKA is pretty similar if not better.

For what it's worth, that's two seconds between when one signal was unexpectedly cut, detected, shifted to the other, windows scaled up for a multimedia stream and the app buffered enough to decide to start playing audio again.

Something considered a two second handoff is going to result in a far longer stream interruption than that in any method I've seen used, especially considering the app is not assisting with the process or getting state transition info, which it could be if built with MPTCP-aware libs.

I wonder how difficult it would be to set up a MPTCP gateway/VPN, so that you could get the benefits of MPTCP all the time without requiring MPTCP support from the server you are connecting to.


    +--------+              +---------+          +---------+
    | Smart- +----WiFi----->| Gateway +---1Gb--->| Youtube |
    | phone  +-----3G------>|         |          |         |
    +--------+              +---------+          +---------+
                  MPTCP                   TCP

It looks like you can do that with a proxy approach, there are active ietf drafts concerning proxies:


Encapsulating tcp/udp inside it probably doesn't work very well due to conflicts between the layers of buffers and congestion control.

Yes, a proxy solution is definitely a way to go, but to help the deployment only. In the end we shouldn't need those.

There's also a Multipath TCP patch for FreeBSD 10-CURRENT:


Slides with more info:


This is a great short summary, followed by the Tech Talk (on youtube) which goes into a bit more depth about the how and why of the design considerations.

What isn't mentioned is that, because the change is 10 KLoC to the kernel (an unprecedentedly huge change for just the functionality they added), it hasn't even been submitted yet, and mainline hasn't even considered it yet.

I'd love to see it hit mainline, or parts of it start getting submitted, but the biggest thing will probably be re-writing the whole thing to have a much smaller impact on the kernel, while still providing the functionality they need (e.g. use the TCP backoff instead of their own, etc).

What advantages does this have over existing link aggregation implementations?

Edit: one advantage I can think of is you don't have to ask your users to set up a bond0 interface, for example. Any other advantages?

The main advantage is the congestion control! With MPTCP you can deal with congestion at the transport level. That means that if you have three subflows, MPTCP will select the best link (lowest RTT, space in the window, etc.) to send data on, whereas bonding just balances with an XOR hash or round robin. This is definitely a big advantage, not only in datacenters but also if you have a 3G and a WiFi connection. More about the congestion control: http://tools.ietf.org/html/draft-ietf-mptcp-congestion
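
To make the contrast concrete, here is a minimal sketch of a "lowest RTT with window space" subflow picker next to a bonding-style round robin. It is only an illustration of the selection idea described above, not the kernel's scheduler and not the coupled congestion control from the IETF draft; the `Subflow` fields and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Subflow:
    """Hypothetical per-subflow state; field names are illustrative only."""
    name: str
    srtt_ms: float   # smoothed round-trip time estimate
    cwnd: int        # congestion window, in packets
    in_flight: int   # packets sent but not yet acknowledged

    def has_space(self) -> bool:
        return self.in_flight < self.cwnd


def pick_lowest_rtt(subflows: List[Subflow]) -> Optional[Subflow]:
    """Return the lowest-RTT subflow that still has window space, if any."""
    candidates = [s for s in subflows if s.has_space()]
    return min(candidates, key=lambda s: s.srtt_ms) if candidates else None


def pick_round_robin(subflows: List[Subflow], packet_index: int) -> Subflow:
    """Bonding-style choice: ignores RTT and window state entirely."""
    return subflows[packet_index % len(subflows)]


def send_burst(subflows: List[Subflow], n_packets: int) -> List[str]:
    """Assign up to n_packets to subflows; stop early if every window is full."""
    assignment = []
    for _ in range(n_packets):
        chosen = pick_lowest_rtt(subflows)
        if chosen is None:
            break
        chosen.in_flight += 1
        assignment.append(chosen.name)
    return assignment
```

With an Ethernet, a WiFi and a 3G subflow, `send_burst` fills the fastest path's window first and only spills onto slower paths once the faster ones are full, whereas `pick_round_robin` would hand every third packet to the slow 3G path regardless of its state.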

Current layer 2 link aggregation is effective for lots of connections with different source/destination pairs because the switches have limited resources and require simple schemes. It works fine for public servers and transit aggregation, but it's very bad at local aggregation to speed up limited connections like trying to bond multiple 1gig ports between san/nas and a server for example.

Exactly. With Multipath TCP, a single TCP connection can benefit from multiple interfaces, which is not the case with bonding techniques. See http://multipath-tcp.org/pmwiki.php?n=Main.50Gbps for measurements showing Multipath TCP aggregating six 10 Gbps interfaces.
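
A rough sketch of why hash-based bonding cannot speed up a single connection: the link is chosen from header fields that are identical for every frame of a flow. The function below is a simplified stand-in for a layer-2 XOR policy (real bonding drivers and switches hash different or additional fields), and its name and parameters are made up for illustration.

```python
def parse_mac(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into its six raw bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))


def xor_hash_link(src_mac: str, dst_mac: str, n_links: int) -> int:
    """Simplified layer-2 XOR policy: XOR the last MAC bytes, modulo link count."""
    return (parse_mac(src_mac)[-1] ^ parse_mac(dst_mac)[-1]) % n_links
```

Because the inputs never change between a given server and its NAS, every frame lands on the same member link no matter how many links are in the bundle. Only traffic between many different address pairs spreads out.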

I've never used link aggregation, but it seems that it requires support from routers/switches, and it can be disabled. MPTCP, by contrast, assembles multi-path streams from normal TCP connections, which work on every device that can reach the Internet. I guess the advantage here is that MPTCP is more practical.

Link aggregation (LAG) is typically only done with links of the same speed. MPTCP instead seems to be focused specifically on links of differing speeds. LAGs are defined by standards such as IEEE 802.3ad (now 802.1AX) and are typically done at layer 1.5. They are a link level protocol. MPTCP is a layer 4 protocol and so does not require the directly connected device to be aware of it.

It's not restricted to a single hop; it provides redundancy over completely different paths. As demonstrated, they used ethernet, wifi and a cell radio, then removed any two of them while keeping the packets flowing. That's pretty impressive.

LAG/MLAG operates at a lower layer (datalink) than IP (network) and TCP (transport).

Because of that, LAG/MLAG can support more than just MPTCP and, in fact, the two aren't mutually exclusive.

Are there some more detailed docs I haven't found? I have a dozen questions about how their protocol reacts to common problems.

It seems like the kernel's just treating it as a "dumb" tcp connection that rx's and tx's on more than one interface. In this case you're taking the problems existing tcp connections have and multiplying them by the number of links. SCTP works around these problems and makes both the connection and the delivery of arbitrary messages more reliable, though seemingly at a heavy cost to performance.

There are several related RFCs[0], and the IETF's "Data Tracker" page for the MPTCP[1] working group has a lot of great documents as well.

[0]: https://www.google.com/search?q=multipath+tcp+rfc

[1]: http://datatracker.ietf.org/wg/mptcp/

MPTCP creates multiple independent TCP subflows and distributes the data among these subflows. The receiver can then reorder the packets thanks to the additional data-sequence number space.
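
The receiver-side reordering can be sketched as follows. This is a toy model that assumes each segment is tagged with the data sequence number of its first byte; the real protocol carries this mapping in TCP options alongside each subflow's own sequence numbers, and the names here are illustrative.

```python
import heapq
from typing import Iterable, Iterator, List, Tuple

Segment = Tuple[int, bytes]  # (data sequence number of first byte, payload)


def reassemble(segments: Iterable[Segment], initial_dsn: int = 0) -> Iterator[bytes]:
    """Yield payload bytes in data-sequence order, buffering out-of-order arrivals."""
    expected = initial_dsn
    pending: List[Segment] = []  # min-heap keyed on data sequence number
    for dsn, payload in segments:
        heapq.heappush(pending, (dsn, payload))
        # Deliver everything that is now contiguous with the stream so far.
        while pending and pending[0][0] <= expected:
            seg_dsn, seg_payload = heapq.heappop(pending)
            if seg_dsn + len(seg_payload) <= expected:
                continue  # duplicate, e.g. retransmitted on another subflow
            yield seg_payload[expected - seg_dsn:]
            expected = seg_dsn + len(seg_payload)
```

Segments from a fast and a slow subflow can be fed in whatever order they arrive; the application still sees one ordered byte stream, which is what lets MPTCP present a regular TCP interface.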

Awesome Project! I'm curious how this relates to the Mobile IP Standard (RFC 5944). Is Multipath TCP intended to Augment Mobile IP? Replace it? Not related at all? I would be very interested to hear from the Developers / Researchers how they envision Multipath TCP working with (or replacing) other Standards.

I would also be interested in hearing what needs to be done on the Client End for Multipath TCP to be supported; For example, say I was to build a Linux Server with the Multipath TCP enabled Kernel (to serve my App), would the Clients (likely Mobile Smartphones) also need to run an O/S with a Multipath TCP enabled Kernel? I'm assuming the answer is Yes, but I'm kind of hoping I missed something and the answer is No. Because if this is the case, then both Android and iOS would need to support Multipath TCP (which means this may not become mainstream for quite some time).

Also, I noticed the referenced PDF said the Technology was already being Licensed (with a Non-Disclosure Agreement). This seems to imply it will not be Open Source. I checked the Git Repo, and don't see anything about this. Are the Developers / Researchers related to Multipath TCP intending to Open Source this?

In any event, Thank you for this release, something fun to play with!

1. The server and the client must be MPTCP-"aware", so the answer is yes (but look, someone is talking about a proxy up there; it's a solution to avoid this problem).

2. I'm in the team working on the Linux Kernel implementation and yes, this is open source! http://multipath-tcp.org/pmwiki.php?n=Users.DoItYourself

Thank you!!!

What a wonderful improvement to our old networking design. Can certainly see the benefits of redundancy, but the throughput advantage is huge. It will allow massively high throughput between servers by aggregating commodity hardware, scaling out instead of up. >1Gb server hardware is ridiculously expensive. I've tried interface bonding and the performance sucks. Now I can get a cheap 5Gb link using 5 dirt cheap NICs. Awesome.

Isn't this solving the wrong problem? Shouldn't IP support packets with multiple addresses?

How do you propose to modify all of the ASICs in routers that are optimized to do lookups on packet headers at the speeds they need to achieve?

Routers today are doing 32x100GbE in a single chassis. That's an investment in hardware that isn't going to go away any time soon.

SCTP already solved this problem. Multipath TCP is an also-ran.

The promise of Multipath TCP is that you don't have to change the applications or the network middleboxes - if the OS on your end and the remote end are both configured to use Multipath TCP, it'll be used.

SCTP requires either recompiling applications or using a TCP/SCTP proxy, and it doesn't solve the middlebox or discoverability problems. There's room for both solutions.

That's the clearest case I've seen for not buying "smart" hardware (firewalls, routers, etc) that know any more of the higher levels of the stack than necessary.

I think the parent comment is valid though. If we wind up holding 2 or more TCP connections to multiplex, with failover, we are basically creating an overlay network that will have the same needs as the lower levels. Namely, routing, link cost metrics, fragmentation, etc... This is effectively what IP attempts to add on top of 802.3.

With that in mind, it seems like it would be more faithful to the point (possible, if not practical) to use the Ethernet+IP stack directly on top of two TCP connections (providing simulated physical links), creating a throw-away network segment between two hosts that could re-use existing routing and bonding logic. I don't think Linux could handle that many virtual interfaces or taps, but that feels like the right place for it...

The devices handling 32x100GbE connections are only able to do that because of the very specialized ASICs that have been custom developed.

I'm not aware of any device that is software-based that could come remotely close to handling that amount of traffic. Most current PC-level hardware has trouble w/ 2x10GbE links (if they can even support that).

I agree that specialization is often necessary, but I would prefer to think in terms of add-ons to general purpose commodity architectures. For example, GPUs are specialized for the task of 3D math, but their particular purpose is decided by the CPU. Similarly, FPGAs can be dynamically programmed for a special circuit. Neither is ideal, but in hindsight I would have liked to see something similar develop for the task of routing - and the economic pressure applied by not buying ASICs (that break the stack separation) would have helped (yes we have benefited from them, but we have benefited from taking a specific path - we don't know if there was a better one). Routing probably would have benefited from an earlier and stronger push towards multicore processors - only there was no reason to do it in consumer CPUs because we view routing and computing largely as exclusive domains (even though the Internet's success has come from connecting many general purpose endpoints... Why is the fabric not also general purpose?).

Going forward, I suspect GPUs could also be applied to make routing decisions with acceptable speeds. Have a look at http://www.date-conference.com/proceedings/PAPERS/2010/DATE1... for some trials.

The specialised chips that are used by routers are CAMs and TCAMs. Basically, associative arrays (hashes or dictionaries as they're called in most scripting languages) implemented in hardware. You can program an FPGA to run as a CAM. Mostly what they're doing are repeated lookups of where to switch or route packets (based on header fields). If you're in a CPU world with RAM you can try and optimise the data structure that holds the lookup tables so you get decent speed but with CAM it's implemented in hardware so the lookup times are fixed and known. It's not really a limit of computation that can be helped with multiple cores, the CPUs can be pretty slow. Multithreading or parallel processing is typically handled on routers by distributing the workload to packet processors that handle a single port or a group of ports, all looking at the same CAM (or copies of it in the more distributed chassis-based architectures). Modern routers are basically supercomputers in a box. Of course there's an inherent limit at the PHY level where you can only receive or transmit a single packet at a time on an interface so you can't multiprocess beyond a CPU per port basically which limits the complexity of the design. 1000 cores on a GPU won't help you if you only have 100 ports to route traffic on.
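
For comparison, this is roughly what the "optimise the data structure in RAM" approach looks like in software: a longest-prefix-match table with one hash map per prefix length. It is a minimal IPv4-only sketch with made-up class and method names, not how any particular router or kernel implements its forwarding table.

```python
import ipaddress
from typing import Dict, Optional


class RouteTable:
    """Software longest-prefix match: one dict per prefix length (IPv4 only)."""

    def __init__(self) -> None:
        self._by_len: Dict[int, Dict[int, str]] = {}

    def add(self, prefix: str, next_hop: str) -> None:
        net = ipaddress.IPv4Network(prefix)
        self._by_len.setdefault(net.prefixlen, {})[int(net.network_address)] = next_hop

    def lookup(self, address: str) -> Optional[str]:
        addr = int(ipaddress.IPv4Address(address))
        # Probe the most specific prefix lengths first. The software cost grows
        # with the number of distinct prefix lengths, whereas a TCAM compares
        # against all entries at once, giving a fixed lookup time.
        for plen in sorted(self._by_len, reverse=True):
            mask = (0xFFFFFFFF << (32 - plen)) & 0xFFFFFFFF
            hop = self._by_len[plen].get(addr & mask)
            if hop is not None:
                return hop
        return None
```

Each lookup here is a handful of sequential hash probes whose count depends on the table contents, which is the variability the hardware lookup avoids.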

On a typical Intel PC, the MMU's TLB is implemented in CAM as well but it's tiny in comparison to the ones in routers. CAMs are pretty power hungry compared to RAM. They are becoming commodities though and the router manufacturers are getting away from doing custom ASIC designs. Intel bought Fulcrum which makes these packet processors and companies like Big Switch are doing cool stuff with white box network hardware. It's really going to bring the cost of the hardware down pretty dramatically in the next couple years.

I'm looking forward to that!

Uh...these routers aren't "smart". They're doing the bare minimum to get a packet from an input interface to an output interface.

Basically, ANYTHING you try to do at the speed and scale you'd end up needing to use these routers for would be at least as complicated if not more so.

The level of engineering required to de-encapsulate a packet, deal with the L2(Ethernet),L2.5 (MPLS), L3(IP) headers, perform a lookup in the appropriate table (might be more than one), possibly deal with any encapsulations INSIDE that packet (cough-cough GRE) which starts the whole process over again is IMPRESSIVE. You're not going to do that in software, my friend. And anything you build that even begins to look like a general purpose processor will be unable to perform to the degree to which the custom ASICs already do.

NPU (Network Processor Units) were supposed to make this easy but they never took off so now what few remain are doing things like NAT or Application Firewalls, and doing a relatively poor job for the money compared to a reasonable x86 server with appropriate I/O engineering.

tl;dr: you're woefully ignorant about what it takes to make the Internet run fast.

I didn't say it would be fast, only that it seemed like the right place for the specific goals of multipath TCP in order to minimize code duplication and overcome the various "smart" devices (a marketing word, not an accurate description) that pierce layer boundaries (like smart switches that look at IP headers, and routers that look at TCP/UDP flows) and have effectively rendered it impossible to develop new protocol semantics in lower layers (because they won't make it across the Internet; IPv6 could be deployed by now if lots of ISPs didn't have "smart" devices blocking everything but IP+TCP+some special types of UDP). I blame such devices for making it unnecessarily complicated to develop multipath semantics over the current Internet.

The rest of your comment is pretty insulting. :)

Uh, you don't know what you're talking about. ISPs don't block UDP. And IPv6 was held up by a lot more than hardware. Sorry.

Go read up on how IP routers actually function and then report back with what you've learned.

Source: I've been doing this (networking) since before there was an Internet. I currently don't touch routers that do anything less than 10Gbps/port and really prefer to work with 100Gbps/port. That's a cool couple o'million bucks per chassis, yo.

If you want to try it, it's easy: http://multipath-tcp.org/pmwiki.php?n=Users.HowToInstallMPTC...

Also, we have a measurements campaign running, if you want to contribute ;) http://multipath-tcp.org/pmwiki.php?n=Users.AboutMeasures

Whatever happened with SCTP?

Used everyday in 3G/LTE networks to setup the data paths through the network.

The only thing I've ever seen in it was a modification of FTP to use SCTP streams on a single connection rather than have two connections. This would of course make the whole active-FTP-over-NAT business easier, except for the fact that the average NAT device ($50 home router) probably doesn't play nicely with SCTP.

Given the subject matter, I should probably include my usual disclaimer string: I work as a software engineer for F5 Networks. Anything I post here is my own thought and is not intended to represent F5.

When you use a mobile phone, chances are good that most of the control plane traffic (call setup, SMS, data channel setup/teardown, roaming signalling, phone registration and so on) travels through an SCTP association somewhere.

It's a shame. Erlang supports it nicely, too. But like koenigdavidmj, I understand lots of cheap home routers don't like it. If you want a protocol that consumers can use without worry, sadly it seems to me that you better base it on TCP or UDP under the covers. Which multipath TCP does, according to https://lwn.net/Articles/544399/

For that reason, WebRTC Data channels use SCTP but encapsulated in UDP. It is implemented in user space inside the browser (probably not what you'd want, but it at least encourages deployment).

[ref] http://tools.ietf.org/html/draft-jesup-rtcweb-data-protocol

I vouch for it being used in a very high-profile, highly concurrent online service with great results.

In this particular case we owned both the client and server-side networking libraries - but all-in-all it worked quite well and helped us deal with NAT traversal issues among other things.

I've never seen it mentioned outside the Stevens book, other than that it's supported in Java 7.

In short: hard to deploy because of middleboxes dropping all non-TCP/UDP traffic.

It's pretty slow and relegated to telcos and people who need HA multihoming.

Is this slowness intrinsic to SCTP, or a matter of implementation?

It's always a matter of implementation.

SCTP is the solution to the problem that Multipath TCP is trying to solve. The biggest challenge is the penetration of TCP libraries vs. SCTP libraries.

...and that SCTP has higher overhead/complexity. But 'slowness' is an imprecise term.

Some applications will perform much faster over TCP because the link is short, reliable, and the application only needs to stream data in-line. Heck, even with network loss you can see gains in TCP vs SCTP.

Some applications will be faster with SCTP because of either the nature of how the application handles network data, or the properties of the network they're traveling over. But speed was never a design consideration of SCTP.

If you want speed, use UDP. If you want something that's fast and "just works", use TCP, or this multipath TCP thing. If you want something designed to pass multiple independent unidirectional data messages across a single "connection" that can span multiple network paths and never miss a beat, use SCTP.

Wasn't sure if there was something particular about SCTP that could induce algorithmic latency such as a buffering requirement, setup/negotiation, congestion control, other overhead, etc.

It uses CRC32c as a checksum, and that cannot be offloaded to most commercial NICs. This will also have an impact on routers that manipulate the transport layer ports, due to the checksum recalculation cost (which is trivial for the checksum used in UDP and TCP).
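
The "trivial" recalculation for TCP and UDP comes from the ones' complement checksum allowing an incremental update when a single 16-bit field such as a port changes. A minimal sketch of that update, following equation 3 of RFC 1624, with illustrative function names:

```python
from typing import Iterable


def ones_complement_sum(words: Iterable[int]) -> int:
    """Add 16-bit words with end-around carry."""
    total = 0
    for w in words:
        total += w
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return total


def internet_checksum(words: Iterable[int]) -> int:
    """Full 16-bit ones' complement checksum over a sequence of 16-bit words."""
    return ~ones_complement_sum(words) & 0xFFFF


def update_checksum(old_csum: int, old_word: int, new_word: int) -> int:
    """Incremental update per RFC 1624 eqn. 3: HC' = ~(~HC + ~m + m')."""
    return ~ones_complement_sum(
        [~old_csum & 0xFFFF, ~old_word & 0xFFFF, new_word]
    ) & 0xFFFF
```

Rewriting a port therefore costs a few additions regardless of packet size. A CRC has no equally cheap shortcut in typical implementations, so a device that rewrites an SCTP port generally ends up recomputing the CRC over the whole packet.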

Routers almost NEVER manipulate the transport unless they are doing something very un-Layer3 like NAT or some other encapsulation.

What I especially like about this is that the project was developed in an academic context while being hugely beneficial to society. I mean, improving the performance of a fundamental protocol like TCP has huge effects. If only we had more academic research like this one! Btw, also cool that it was developed 15 minutes from my house.
