Frivpn – A multi-threaded OpenVPN client (github.com)
91 points by payne on Feb 20, 2018 | 54 comments

This is a very interesting project, as all of today's processors, even the Raspberry Pi ones, are multi-core.

I assume the author does not want to rewrite OpenVPN from scratch, because "multithreading", "networking" and "cryptography" together immediately set off alarm bells when it comes to C. People write this in Rust nowadays, don't they? ;-)

I’m not sure I would want to do crypto multithreaded in any language. Too easy to leak information via timing even with a single core.

Can you clarify what you mean? You seem to imply that using multiple threads is not a good idea, but then also express that not using multiple threads is also not a good idea. What's left?

I think encrypting independent packets on each thread is fine; what I believe he is referring to is encrypting a single packet on multiple threads. One of the many angles of attack against cryptography is the side-channel attack, where for certain algorithms the path taken by the code depends on the value of the key or data. In that case you can leak some information to anyone measuring the time it takes or the power usage. Modern algorithms are carefully designed not to be path-sensitive, which is hard enough when the algorithm executes on a single thread but becomes really complex to manage on multiple threads. Just more opportunities to shoot oneself in the foot.
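To make the "path depends on the data" point concrete, here is a minimal sketch in Python. The function names are made up for illustration, and pure Python is not actually constant-time, so treat this as a picture of the idea rather than something to deploy. The early-exit comparison stops at the first mismatching byte, so the amount of work reveals how much of a guess was correct; the second version touches every byte regardless. The standard library's hmac.compare_digest is the real tool for this job.

```python
import hmac


def leaky_steps(secret: bytes, guess: bytes) -> int:
    """Count how many bytes an early-exit comparison inspects.

    The count stands in for running time: it grows with the length
    of the correct prefix, which is exactly what an attacker measures.
    """
    steps = 0
    for s, g in zip(secret, guess):
        steps += 1
        if s != g:
            break  # bails out at the first mismatch
    return steps


def constant_time_equal(a: bytes, b: bytes) -> bool:
    """Compare every byte and accumulate differences without branching on data."""
    if len(a) != len(b):
        return False
    diff = 0
    for x, y in zip(a, b):
        diff |= x ^ y  # no early exit: all bytes are always visited
    return diff == 0


secret = b"secret"
# The work done by the leaky version depends on where the guess goes wrong.
early_mismatch = leaky_steps(secret, b"xecret")
late_mismatch = leaky_steps(secret, b"secrex")

# In practice, use the standard library helper instead of rolling your own.
stdlib_result = hmac.compare_digest(secret, b"secret")
```

Splitting one such operation across threads adds scheduling and cache effects on top of this, which is the extra complexity being described above.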

That's a fair argument; however, we also want to exploit computational resources. It's nothing new that clock speeds are stagnating, and everything points to manycore architectures. Parallel crypto may be hard, but it is somewhat inevitable that we research, investigate and implement it.

Clock speed does not necessarily equal computational speed. I remember how ~15 years ago people predicted that single-core performance was not going to improve any longer, and that we would get more and more CPU cores over time.

My first multi-core machine was an Athlon64 X2 with two cores running at 2 GHz. Today, my work laptop (ThinkPad L540) has a dual-core CPU (plus 2-way SMT) running at 2.6 GHz (3.2 GHz with Turbo Boost). But single-core performance has improved notably. Not as much as was common in the 1990s, but still. And for most desktop systems[1] more than two or four cores seem to be pointless anyway, since they will sit idle most of the time.

[1] I am not talking about developer workstations, CAD machines or other high-end use cases, but the kind of machine where some office drone runs Outlook, Excel, and a web browser.

Most of this single-core improvement, if I understand correctly, comes from speculative execution and larger caches (rather than shorter pipelines), which -- in the context of crypto -- means that implementations go to great lengths to avoid the features that speed up execution, specifically BECAUSE they leak information.

Fair point.

Now, I know just enough about cryptography to know I know nearly nothing about it, so maybe someone can put this into context: Wouldn't a multi threaded crypto implementation open up a whole bunch of new potential attacks?

Good argument, but it is probably limited to the x86 architecture (and its contemporary grandchildren). In contrast, the OP explicitly mentions the Raspberry Pi with its ARM CPU. This kind of RISC chip (more or less) secretly took over the world, since virtually all mobile devices are equipped with such a SoC. It is common knowledge that these CPUs don't deliver the single-core power of a desktop CPU but nevertheless parallelize. Without doubt it is worth getting crypto fast(er) on such devices.

Fair point, I had not considered ARM and other low-power CPUs.

How many of those cores turbo boost to 3.2GHz though? Isn't it just one, even on the i5-8xxx laptop chips?

Probably. But I rarely push that computer to the point where it uses more than one core, except when I compress large amounts of data.

I think I understand that, but wouldn't you think that adding an "unknown fuzz" by sharing computation across one or many threads makes it harder for an external party to mount a timing attack?

The attacker now has a lot more possible outcomes depending on the number of threads handling that packet. It seems, on the surface, as if more threads add complexity for the attacker.

Debugging multithreaded code is much harder than single threaded, especially because of all the race conditions that can happen: it makes our brains burn hot from all the possible outcomes from identical runs run one after the other. My intuition would lead me to believe that this complexity is also now added to the attacker's side - but perhaps I'm missing something.

A general rule of thumb is that any added complexity in crypto algorithms makes the attacker's job easier in the end, as accounting for all of the side channels becomes much harder.

Timing attacks are somewhat annoying and noisy to conduct over the Internet, but you can measure the time a process takes with a resolution down to ~15ns[0]. I bet there are all sorts of cross-core cache errata that leak more time than that.

[0]: https://www.cs.rice.edu/~dwallach/pub/crosby-timing2009.pdf

He's saying that it exacerbates an already complicated issue.

I think it's more interesting than that. This is a mixture of C and Lua code!

Looking at it, I can imagine redoing it as LuaJIT and using ljsyscall to deal with the system calls on all platforms. Might be even more snappy.

All the heavier processing is done in C already. Lua is only used for handling dis-/reconnects, rekeying and the likes. Hence I don't think LuaJIT would improve the performance much in this particular case.

> ... redoing it as LuaJIT...

is luajit supported on lua versions beyond 5.1? just wondering...

This is a strangely worded question. LuaJIT is compatible with PUC Lua 5.1. It will thus not run code that is designed for Lua >= 5.2 without problems or tweaking. However, the differences are not really big in practice. It is rather more likely that the new options that come with LuaJIT (especially the FFI) will lead to the most work when switching over. Not that they would be absolutely needed, but they just make it all so much smoother and faster, which is probably the reason to switch to LuaJIT in the first place.

what a strangely worded response :)

all i can get from your response is a 'yes' to my earlier affirmation that luajit is compatible with Lua 5.1. honestly, responding to gp's comment, i also found lua's embedding in this application quite interesting.

imho, for something which might be living for a long time, using luajit might not be a good idea.

It's like the three major areas of programming where "things may appear to work but are in fact dangerously broken in hidden ways" converged. Compared to other software where the worst case is often crashing or not working, unproven crypto code bases might in fact appear to work (as in, traffic does appear to tunnel correctly), while in fact it could be broadcasting your private keys and plaintext data to anyone who comes knocking.

The Rust people seem to shy away from rolling their own crypto.

Beside the non-zero amount of effort required, nobody wants to be the one responsible for bad crypto. Can't say I blame them.

My goodness, you Rustlings. There are lots of people out there perfectly able to write safe and sound C code even handling crypto, networking and multithreading.

It's well established that writing safe and sound C code is an impossible feat (even OpenSSH had a number of memory corruption bugs in the past). Some people are better at it than others, yes.

That doesn't mean we should drop everything, throw away our C code and rewrite ot in Rust, but the issue is real.

> It's well established that writing safe and sound C code is an impossible feat

You make it sound like this is somehow particular to C.

I hold the opinion that when you're using such an open definition of "sound and safe" (e.g. code will never do anything that could reasonably be called a "bug" from a user's perspective) it is impossible to write "safe and sound" code in any programming language.

But that is not really a problem with the language, but with the definition of the term "sound and safe".

If you use a more constrained definition of "sound and safe", one that would actually allow you to write "correct" code in a language, then C code can be just as sound and safe as Rust code or Coq. It really just depends on your definitions.

In a way I agree with what I believe the grandparent tried to say; one thing that consistently rubs me the wrong way about the Rust community is that they often seem to be using a very constrained definition of "safe" when advertising that Rust is a "safe language", but a very open definition of safe when proclaiming that more or less every other systems language is "unsafe". That may be true using their definition of safe, but it is a bit of a dirty PR trick.

> ... throw away our C code and rewrite ot (sic) in Rust...

well, to me at least, it seems a bit premature to compare the two languages at this juncture. rust is quite nascent, and doesn't have enough data points (imho) to make an objective comparison.

perhaps, when we have large code-bases available in rust doing safety+security critical applications (which might then be worthy targets) it might make more sense.

Did you notice the smiley ";-)" ? It indicated irony.

I don't know if you are aware of it, but "Fri" means "Free" in Swedish so I thought you picked a rather interesting name for your project. Looks cool!

Is it pronounced like the German Frei or the English Free?

English Free, also Danish Fri is the same, but the end of the e sound gets swallowed a bit.

Is the ‘r’ in ‘fri’ in Danish and Swedish pronounced as a trill (Russian ‘Ру́сский’), tap (Spanish ‘pero’), or an approximant (English ‘red’)? Is it dental, alveolar (Spanish ‘pero’) or postalveolar (Hanover-German ‘Rachen’)?


Click the little play button under the respective headings:

Danish: "fri uttal på danska [da]"

Swedish: "fri uttal på svenska [sv]"

Norwegian Bokmål: "fri uttal på bokmål [no]"

The site doesn't play sound for me, not even with activated JS, cookies and DOM storage.

I managed to ask someone who lived in Sweden. They said it is pronounced as an alveolar, short trill.

Is this the linguist counterpart of trolling? :)

Anyway I don't know, it is definitely not the English r however.

I was not trolling in any way. I'm dumbfounded. Why would you think I was trolling?

sorry, it just seemed absurdly overspecific, wanting to know if r is pronounced like any of multiple pronunciations of r that you would have no reason to expect I had any familiarity with. It just seemed like a linguistic joke of some sort, like are you putting me on with this stuff? However - https://www.youtube.com/watch?v=lCScmkh8hQg

Thanks for the video! I have often drawn what one has to do with one's vocal tract to illustrate how to create a certain sound, and I also use a finger in the mouth to show people how to pronounce the 'ch' in the German 'Ich': if you try to pronounce the English 'sh' but have a finger in your mouth on the tip of your tongue, you automatically produce the German 'ch'.

But instead of painting positions and stuffing things in mouths, one can also exchange information about this by just using text. That's what I did in my comment. So, no joke whatsoever. Just a specific question about how to pronounce a specific sound exactly.

This is what it sounds like:


But that doesn't help if the server isn't multithreaded, no?

Also, these days I have switched to IPsec. I just get way better performance than with OpenVPN (I am talking about 600 Mbit/s vs 350 Mbit/s). But it is quite a beast, and you need to have some basic cryptographic knowledge for IPsec.

EDIT: Seems like my first question is only relevant if we are talking about powerful CPUs on both sides. If multicore on client <= single core on server, then this client makes sense.

One thing I hate about OpenVPN is that it's not really an open standard, and more open core than open source.

This is probably the reason it's not included by default in a lot of OSs, whereas macOS, iOS, and I think Windows all have built-in support for IPsec. Yeah, you can download an app, but there is no really good (free or paid) app for Mac and the official iOS client sucks.

This is pretty neat. I looked at the current limitations, though, and a lot of them are things that damage performance, or greatly limit compatibility (TCP-only, no client certs, lzo compression required, only one choice of algorithm). I expect many of these will get fixed over time.

It was originally written to work nicely with IPredator first, simply because that's our current VPN service of choice. UDP support will probably come, but it actually severely impacted performance (about 30% slower). Trying to squeeze out as much bandwidth as possible, we focused on the TCP part for now.

Since Sandy Bridge, I thought all CPUs had AES-NI baked into the silicon. Hardware-based crypto is much faster than software-based, so why is core count a concern?

Performance isn't strictly bottlenecked by crypto operations. Latency is a huge problem with VPNs. This means you want packets-per-second (pps) performance to be optimal, as well as full-duplex packet processing.

Multi-threading means a payload that has already been through a crypto operation wouldn't need to block further crypto ops or wait for other crypto ops before being processed further (just one obvious example of a performance advantage). Also, plenty of people don't even use AES, and even for those that do, AES-NI offloads the actual AES operation, not things like CBC, GCM and CTR.
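A rough sketch of that "packets don't block each other" idea, in Python for brevity. This is not how frivpn does it; the key, packet sizes and function names are invented for illustration, and a keyed BLAKE2 tag from the standard library stands in for the real per-packet cipher and MAC work. Each packet is handled independently on a worker thread, and the results come back in the original order, which matters for a tunnel that must not reorder traffic.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

# Hypothetical demo key; a real VPN derives keys during the handshake.
DEMO_KEY = b"hypothetical-demo-key"


def authenticate_packet(packet: bytes) -> bytes:
    """Stand-in for per-packet crypto: a 16-byte keyed BLAKE2b tag."""
    return hashlib.blake2b(packet, key=DEMO_KEY, digest_size=16).digest()


def authenticate_all(packets, workers=4):
    """Process independent packets on a thread pool.

    Executor.map yields results in input order, so packets leave in
    the order they arrived even if workers finish out of order.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(authenticate_packet, packets))


# Eight made-up 1400-byte payloads, roughly one MTU each.
packets = [bytes([i]) * 1400 for i in range(8)]
tags = authenticate_all(packets)
```

Whether this actually wins depends on how much of the per-packet cost is crypto versus syscalls and copying, which is the point being made about pps above.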

That may be true for Intel CPUs, but you're forgetting the Raspberry Pis, ODROIDs, APUs et cetera.

Now switch to WireGuard for even better performance.

No Windows support, and Android support exists in the form of a kernel module, which effectively means that only those with rooted phones will be able to use it (no iOS support either, as far as I can tell). Better performance is nice and all, but it doesn't really help if you have no way of actually using the software.

Is the WireGuard protocol stable? Are there any high performance implementations for platforms other than Linux?

The protocol is stable, only Linux client at the moment, but they are working on releasing other clients "soon" last I heard. I don't have any new update on what "soon" means, but I hope this year sometime, personally.

Is there a high performance OpenVPN implementation at all?

Question not a statement: how does SoftEther look in terms of OpenVPN performance?
