Curl’s backdoor threat (haxx.se)
403 points by sohkamyung on Sept 12, 2017 | hide | past | favorite | 150 comments

The bit about code signing is really important. The people who wrote distro package managers recognized this years ago, and ever since have at least tried to do the right thing. That can make getting code into those repos a bit cumbersome, but that's kind of the price you pay.

Unfortunately, most of the language- and environment-specific package managers have pretty much ignored that issue. There's too often no way to verify that the code you just downloaded hasn't been tampered with. Heck, half the time you can't even be sure it's a version that's compatible with everything else you have. It's a total farce.

Software distribution is too important, security-wise and other-wise, to leave it to dilettantes as an afterthought to other things they were doing. Others should follow curl's example, instead of just dumping code into an insecure repo or (even worse) putting it on GitHub with a README that tells users to sudo the install script.

I want to agree with you that it's important, but also I want to emphasize how hard it is to do any better than just HTTPS.

If I have distro maintainers sign the packages they build, then I can avoid keeping their private keys on the package servers. That's cool, and it can make it harder for an attacker to ship evil bits. But there's a bootstrapping problem: How did the user's machine originally learn the packagers' public keys, to know what signatures to look for? It got the public keys that were baked into the installation disk, and that disk was almost certainly downloaded from an HTTPS website, without checking any other signatures. (Or even _with_ extra signatures, if the key fingerprint for those was hosted on the same server.) Compromising the server still lets you own all the users.

At the end of the day, running a whole system of offline package signing and key distribution buys you a kind of subtle difference in security: If an attacker owns the package/image server, they can own machines that are set up after that point, but not machines that were set up before. Is that difference worth the effort of maintaining a system like that? Maybe! But it's not really the difference between "secure" and "insecure".

For me a big difference between package signing and HTTPS is that with HTTPS the private key has to be on the web server all the time, so is constantly exposed.

With a package signing key, it can be held on a more easily secured system, and is therefore less likely to be compromised.

> That can make getting code into those repos a bit cumbersome, but that's kind of the price you pay.

That's as clear a statement of security theater as I can think of.

If you make it cumbersome to build packages for a distro and get them accepted into that distro, your distro then ends up with language-level package managers doing an end-run around that "secure" package-managing apparatus.

Meanwhile, non-dilettante security experts are sandboxing the code that they aren't able to verify. Chrome/Chromium does that. Firefox is doing that more and more. What are the cumbersome-package-having distros doing?

> Unfortunately, most of the language- and environment-specific package managers have pretty much ignored that issue. There's too often no way to verify that the code you just downloaded hasn't been tampered with.

This is something Windows seems to have got right, with Authenticode and Catalog signatures - it's easy to verify who signed a file, and if it's been tampered with.

What options are there for this on Linux? File integrity monitoring (Tripwire, Ionx Verisys, whatever) will tell you when files change, but that doesn't help to know if a file is in a 'trusted' state.

> What options are there for this on Linux?

You have RPM signatures and the like to verify the authenticity of packages at the point of installation. If you mean detection of changes to files after the installation, then we need to talk about what it is you're trying to protect against.

If you want to detect accidental corruption (e.g. clobbering of distro-provided files through pip or 'make install'), the built-in verification functionality of RPM (rpm -V) or dpkg (dpkg -V) does a good job at that.
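To illustrate the idea, here is a self-contained sketch of what `rpm -V` / `dpkg -V` do in miniature (the file and manifest names are made up; the real tools compare against digests recorded in the package database):

```shell
# Compare current file digests against a manifest recorded at install time.
d=$(mktemp -d)
echo 'original contents' > "$d/libexample.so"        # the "installed" file
( cd "$d" && sha256sum libexample.so > manifest )    # what the package db records
echo 'clobbered by make install' > "$d/libexample.so"
( cd "$d" && sha256sum -c manifest ) || echo 'deviation detected'
```

On a real system, `dpkg -V curl` (or `rpm -V curl`) runs the same kind of comparison; no output means nothing deviates.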

Malicious modification is something else entirely. Your system files are only modifiable by root. So if an attacker was able to mess with these files, any tool intended to detect that situation might've also been messed with so that it always gives the "everything is fine" assessment.

You can tackle this problem using special hardware. Or do the assessment on a different system using a disk snapshot.

The former can be done on Linux using a TPM and the TrouSerS framework. If the protected set of software changes frequently, though, it's a pain in the ass to manage.

Brooooken. No in-place changes to installed systems. Build once, install once, read-only mounts, application writes only to temp dirs, build/release audits before push... it's the only way.

Assuming one can trust git and GitHub (and CAs), is there a technical reason why you would consider distributing code via GitHub as unsafe?

In addition to account compromise, there's also the risk of bugs/compromise of GitHub itself:


Commit signing can help to mitigate that.[0] Note that GitHub now offers the ability to add your GPG public key to your profile and to show whether a commit is signed with that key. I find this more dangerous than useful: if an attacker compromises the account, adds his/her own key, and then adds a malicious commit, GitHub would show it as verified.

[0]: https://mikegerwitz.com/papers/git-horror-story
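For the curious, a minimal sketch of the fail-closed side of this (the repo path and commit are throwaway):

```shell
# An unsigned commit fails `git verify-commit`, so a verify-before-merge
# policy at least forces an attacker to hold a key your keyring trusts.
d=$(mktemp -d)
git -C "$d" init -q
git -C "$d" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m 'unsigned commit'
git -C "$d" verify-commit HEAD 2>/dev/null || echo 'no valid signature'
# Maintainers sign with `git commit -S` / `git merge -S`; verifying then
# requires the signer's public key in your local keyring, which is the
# same key-distribution problem discussed elsewhere in the thread.
```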

Do you know how the signing requirements work on GitHub when accepting a pull request on a repo requiring signed commits, when the pull request is from a fork where someone is not signing their commits? Must the commit to the fork be signed in order for the pull request to be merged, or is it possible for the main repo to merge an unsigned commit while signing it themselves in the process?

I can see requiring every commit on the primary repo to be signed, but it's a larger nightmare to accept pull requests from forks if they are also forced to sign their commits.

I'm not all that familiar with GitHub.

What's ultimately important for trust is that the maintainers (or whomever you are to trust) sign commits. They may choose to pass this responsibility down the line a bit (e.g. how Linus has his "lieutenants"), but if some random contributor does or does not sign a commit, do we care? Are they in the maintainers' web of trust? What benefit does verifying their identity actually have with respect for the project?

So in that case, a maintainer may decide to just review the patches and sign the merge commit.

That contributor may want to _assert_ their identity---e.g. have their signed commit committed to the repository to show that they actually did that work---but that's a different issue.

There are several reasons, but not specifically to do with code signing. Code in a distro repo has been at least cursorily checked to make sure the install script (or "scriptlets" in something like an RPM specfile) doesn't do anything awful. Some of that's automated, some of it's manual, but at least it's there. An exploit would have to get past both the author and the distro gatekeepers to become operative. With code on GitHub, it only has to get past the committer - who might, unlike distro packagers, be totally clueless about security or even basic bash-scripting safety rules. That's just too easy IMO.

Stolen GitHub credentials.

The code is what we are trying to verify, not that someone claiming to be user x published it.

That is basically the same as trusting `apt install`. We just hope apt repo maintainers have higher levels of opsec.

Most trust on a typical Ubuntu install, for example, is still chained back to a TLS download of an ISO (or maybe a torrent file). That bootstraps your repo public keys.

You should always be verifying your downloads:


I have done this before. But if you are actually being MitM'd, downloading the GPG sums from the same source as your ISO is a little pointless. That is the problem: how do you bootstrap the trust? It all still just goes back to TLS and the CAs.
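For concreteness, the checksum half of the dance looks like this (a local sketch; the ISO and sums files here are fabricated stand-ins for what a mirror publishes):

```shell
cd "$(mktemp -d)"
echo 'pretend ISO contents' > distro.iso
sha256sum distro.iso > SHA256SUMS     # what the mirror publishes
sha256sum -c SHA256SUMS               # integrity check: prints "distro.iso: OK"
# Authenticity is the separate step, e.g.:
#   gpg --verify SHA256SUMS.gpg SHA256SUMS
# and it only helps if the signing key was obtained out of band;
# a key fetched from the same server proves nothing against a MitM.
```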

You can check the sums over time and acquire them using multiple connections to verify they are all the same to gain a higher level of confidence, but this is actually annoying for someone with a high level of technical skill and basically impossible for most people. I only do this for things that require the highest levels of operational security, like, say, if I am setting up a system to sign certificates in a CA or something.

A slightly easier approach is to strip down your CAs to a bare minimum in your browser config, and double check the certificates being presented on TLS download sites. You can still be owned by a MiTM if the CAs actively collude with a nation state and have given them signing keys, but... there isn't much to do about that. The options really aren't that great in terms of really verifying things.

And the security of the developer who signed the package too?

And we are back to Thompson, basically. :)

> We just hope apt repo maintainers have higher levels of opsec.

They do.

Backdoors in software are one of those nightmarish scenarios that disturb you until you just think about something else and kind of temporarily forget them (like nuclear war, killer bees, climate change or asteroid impact). Open source just raises the bar but in no way solves these problems (for example OpenSSH's roaming feature/bug, or OpenSSL and Heartbleed).

In computing one can quickly become paranoid: software buttons for switching smartphones off (are they 'really' off?), always-on microphones, webcams (granted, you can cover those), blobs in your smartphone and home router, downloads from http://download.cnet.com (note the lack of https), WinSCP, PuTTY from www.putty.org (again no https, and not even the actual site, but nevertheless the first result on Google). On Linux the landscape is slightly better, but do you really trust all those packagers? Do you really understand each line of code when you "git clone foo; cd foo; ./configure; make; sudo make install"? And in X11 you can easily make a key-logger without even being root! [1]. Not even full disk encryption can protect us (Evil Maid [2]).

That's one of the reasons I'm skeptical of the Ethereum smart-contract concept. In theory it works, but in practice I'm not sure at all. The DAO heist was one early example of security bugs in smart contracts, but I fear they will become more common when malware developers turn to "contract-engineering".

[1] https://superuser.com/questions/301646/linux-keylogger-witho...

[2] https://www.schneier.com/blog/archives/2009/10/evil_maid_att...

> https://www.schneier.com/blog/archives/2009/10/evil_maid_att...

The following part of this classic (2009) article caught my eye:

> Symantec’s CTO Mark Bregman was recently advised by "three-letter agencies in the US Government" to use separate laptop and mobile device when traveling to China, citing potential hardware-based compromise.

It is strange how much this changed after the Snowden revelations. In Europe, you now get that same advice when travelling to the US.

Yup. I no longer carry my regular laptop or phone to the US. I have separate devices that receive a wipe before entering and after leaving. Doesn't prevent hardware compromise though, but at least it won't compromise my daily driver.

Even US companies advise their overseas employees to be careful and thoughtful as to what you bring with you, b/c of what can happen at the border: https://github.com/basecamp/handbook/blob/master/internation...

Now all we have to worry about are the attacks from our countries' domestic security services.

You mean by the private contractors they hire.

Reminds me of badBIOS[1]. A security researcher became sure he was infected by a very advanced piece of malware that could hide itself completely. Many people think it was just his imagination.

[1] https://arstechnica.com/information-technology/2013/10/meet-...

And then the Intel ME debacle became widely known and, as usual, the tinfoil-hatted among us got to say "told you so".

Small nitpick:

The tinfoil hats would have loved to say "told you so", but even they had not predicted the true scale of things.

Intel ME is not just some BIOS code, but a separate hardware component that is deeply connected with all CPU actions.

Nitpick to your nitpick:

Real tinfoilers have been warning about this since the 80's

Yeah, remember when people were laughed at for talking about ECHELON?

Yeah, growing up with ECHELON and Carnivore, the whole Snowden thing didn't surprise me at all. You didn't even need to assume they were doing it, past evidence pointed to the fact the government was 100% spying on everyone already.

And do you know what happened to those people?! Nobody knows, of course! They got taken out coz they know too much!

Intel ME was widely known of before BadBIOS. BadBIOS is still nonsense.

BadBIOS is not nonsense. It uses CPU virtualisation and hard drive controllers which appear as USB devices in order to load first, then runs independently of whatever OS you are running. If your BIOS has been compromised then you can't switch off CPU virtualisation, and your BIOS looks and feels like the real thing. Any electronic device with chips that can be updated can store the malware: USB printers, card readers & writers, some keyboards, the list goes on.

Is there any evidence that badBIOS actually exists? From what I heard, the researcher never found any real evidence.

There was no evidence.

I thought that one of the specific BadBIOS claims was that it can jump across air-gapped machines via ultrasonic audio played over the speakers or something.

It's definitely possible to build a full virtualization environment that feels pretty seamless, but that was only a portion of the claim.

I don't know about ultrasonic, but I certainly have a camcorder recording of some high-frequency sounds which could be used to jump air gaps, much like the sounds dial-up modems used to make when handshaking. You wouldn't hear the sound in a normal office environment, only in a silent office, because the external speakers & amp were turned up close to max but not playing anything, which is also not normal for most offices. It's quite likely that those behind this form of attack, whilst highly technical, are not able to deduce what environment the attack is taking place in, such as a normal office versus a silent one.

As speakers can be used as microphones (the technology is essentially the same, just different ohms and materials used for the cone) and modern motherboards can detect when a 3.5mm jack is plugged into a headphone socket, it might be possible to have the speakers acting as a microphone in some situations. It's something I'm still looking into, but I have noticed that some DJ mixes on YouTube will play up, i.e. go quiet, when you have headphones plugged in but not when using built-in speakers like those found on a laptop. You can reset the mix going quiet by unplugging the 3.5mm stereo jack. Whether this is some sort of DRM technology being used (some of the DJ mixes will be illegal copies uploaded to YouTube), I don't know yet, just as I don't know if these are related or separate events to BadBIOS. It's not unheard of for big corps to employ methods to disrupt illegal copies of music & films; the Sony rootkit on some of their music CDs is one such example of big corporations hacking their customers. https://en.wikipedia.org/wiki/Sony_BMG_copy_protection_rootk...

Yes, you could use it to jump air gaps. (In fact, that technology is already deployed in production in Chromecast pairing, when your phone is not on the same network as your Chromecast.)

But don't you need something on the other side to receive and decode the data being sent? What's the BadBIOS story for how the infection initially happened?

(Is the assertion that something along the lines of Intel ME is already listening for control instructions over ultrasound?)

The CIA has been known to tamper with electronics, so between that and pre-backdoored hardware (IME) it's fairly likely that a determined opponent has one or more means to passively wait for a payload in a hard to detect manner.

From there, it's turtles all the way down; you "only" need to deliver an ulterior, possibly tailored, payload from any of the several methods described in this thread.

Could those high pitched sounds you recorded just be capacitor whine or noise on the power supply? All cheap sound interfaces in computers produce noise that correlates with CPU, GPU, or bus activity, and many power supplies make squealing sounds that can be heard in quiet rooms. It's conceivable these could be manipulated as side channels in an already compromised system, but they exist regardless of compromise.

That's the beauty of such methods. Anytime an avenue for compromise seems "noisy", I'd be willing to bet someone smart organised and well funded has investigated using it for hi-value penetration.

>you can't switch off CPU virtualisation

OK, makes sense

>looks and feels like the real thing

Not really, because you can't turn off virtualization

>> looks and feels like the real thing

> Not really, because you can't turn off virtualization

To be clear, what I mean is that the BIOS can be made to look and feel like the real BIOS when in fact it is a compromised BIOS which still shows the manufacturer's logo, menu options, etc.

Bottom line is, if someone can make it, someone can modify it.

Only by getting a SOIC clip hooked up to something like a Raspberry Pi to dump the chip's contents will you possibly be able to tell otherwise. Something like this might get people pointed in the right direction. http://www.win-raid.com/t58f16-Guide-Recover-from-failed-BIO...

One of the other attributes I have seen with this "suite of malware" is its ability to spread over the USB bus. If you're hit with it and it stops your USB devices, including printers, mouse & keyboard, PS/2 mice and keyboards still work, so you can safely shut down your machines instead of pressing and holding the power button to force a shutdown, which can lose work. Let's face it: when looking at how USB works, it's a disaster waiting to be exploited, considering how much attention goes into monitoring ethernet traffic by comparison. Having spoken with Tomasz, the developer of this tool http://desowin.org/usbpcap/, on Windows you only get what Windows will show you, so a separate malicious OS using CPU virtualisation could still interfere with the USB bus.

One file I've isolated when using Parted Magic will not display its entire contents in the included open-source hex editor if it's accessed using the hex editor's file-load method, but if you read the block device sector by sector, you can navigate to where the suspect file is stored and then read its entire contents. There seems to be a sort of magic string which prevents further investigation of suspect files, including the ability to dd /dev/zero over block devices, which is a nuisance.

In all, whoever is behind this is IMO hacking chips so that even if you do a full disk wipe to whatever standard and reinstall the OS, you'll never get rid of it unless you reprogram the chips. It's like a complex zero-day spread over hardware and software: in isolation the parts look innocent enough, but combined they become malicious. It's very, very clever, whoever is behind it, and there aren't many entities with the resources or knowledge to pull something like this off IMO.

Plus, considering the NDAs that exist with chip/CPU/hardware manufacturers, knowledge at this level is even more restricted.

If you want to get lucky, don't always follow industry-standard practices; it's sometimes the only way to spot the anomalies.

Thanks for sharing this. Got any links/further reading related to this mess?

BIOS and firmware viruses have been explored for a long time. That alone is not "BadBIOS", which was a claim about a novel C&C (or, if you're extremely paranoid, infection) mechanism using the PC speaker, and an unrealistically robust infection potential. (And to be clear, "BadBIOS" does not exist.)

True but for this argument, I'd say the specific location of the backdoor isn't the relevant part.

I feel like this needs to be repeated over and over. In theory, encryption and smart contracts and bitcoin are mathematically sound and "you can't argue with the math!"

But the theory is written into real programs by humans, who make mistakes. The math is unassailable until, oops, it isn't, because someone forgot a bounds check!

Which is why you use mathematics to write formally verified software e.g. in Coq. :)

This whole "move fast and break things" philosophy should be unacceptable if you want people to trust your new cryptocurrency/voting machine/etc., but try telling your investors that it'll take 5 years to develop the software instead of 5 weeks to reach "a first prototype" whose bugs and bad design decisions will haunt you forever...

Coq extraction is not formally verified.

1) It will be very soon: http://www.cs.princeton.edu/~appel/certicoq/ CertiCoq is a formally verified compiler from Coq to assembly (using CompCert's backends), not just an extraction to OCaml/Haskell/Scala.

2) Extraction is not the only approach to verifying software with Coq (see Verifiable C, or Bedrock). In other proof assistants, e.g. Isabelle or HOL, extraction isn't even available, so other approaches are common. For a nice example, look at the bootstrapping process of CakeML (https://cakeml.org/).

3) Even if it wasn't, the point is that the trusted base with verified software is tiny compared to anything else that people are actually using. "It's not perfect" is not an excuse if it is basically perfect in practice. See the Csmith paper ("Finding and Understanding Bugs in C Compilers", https://embed.cs.utah.edu/csmith/) and what they had to say about CompCert.

I agree. The equation is simple: as long as three-letter organisations have more manpower to scrutinise code and find bugs in OSS software, this model will not work. Three-letter organisations don't need to implement backdoors; they just need to review existing code, find bugs, and exploit them. For most developers, writing code and adding functionality is fun and rewarding in itself; reviewing code is not, and this is why we end up with OSS projects' core teams and maintainers overwhelmed and making mistakes.

And then there are potential backdoors in hardware (or microcode). Supply-chain attacks really are a nightmare.

Suppose you are a medium-sized IT company (large enough to be a valuable target, but not Google-large, where you can employ a really good security team that knows its hardware inside and out). Then think of all the hardware you buy from different vendors and manufacturers, and make a list of those that could totally own you if there were a backdoor. What can you even do except shudder in horror?

> Open Source just raises the bar but in no way solves these problems

While technically true, you are comparing air travel safety with motorbike safety.


Not really, no. It's a deep problem that nobody has solved. Just because some code is on Github doesn't mean they're running that code, for example.

>putty from www.putty.org (again, no https, which is not even the actual site, but nevertheless the first result from Google)

Doesn't matter on Windows. The binaries are Authenticode-signed.

A better example might have been Cygwin. As far as I can tell, none of their executables (particularly the installers) are signed.

The sort of hypothetical security vulnerability here is likely to depend on undefined behaviour (buffer over-runs, subverting parsers, etc etc). Just another reason to continue moving over to safe languages, especially for the lower level bits of our stacks. HTTP is big and complicated, I'm much happier exposing Rust/Go/C#/... to it than I am exposing C to it.

In safe languages, backdoors must be far more explicit, so we close off the likely scenario posited here.

"Safe" languages make it harder to write some classes of bugs; this is good. I wonder, though, whether it's not a better return on investment to focus on sandboxing wherever possible? I can run curl in firejail right now with no code changes whatsoever. On supported systems, pledge() and SELinux rules can mitigate attacks with minimal effort. And we get to keep existing programs without investing the man-years to rewrite something that works.
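As a sketch of how cheap the no-code-changes route is (assuming firejail is installed; option names are from its man page, and the fallback line is just for machines without it):

```shell
# Confine an unmodified curl: private tmpfs home, no network access.
if command -v firejail >/dev/null 2>&1; then
  firejail --quiet --private --net=none curl --version
else
  echo 'firejail not installed (see also OpenBSD pledge() and SELinux)'
fi
```

One wrapper command rather than a man-year rewrite, though note it only contains exploitation of a bug rather than preventing the bug.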

These things are excellent, but lots of stuff gets pulled in via libs. Security is a layer and spectrum, both techniques are valid, valuable and needed.

We need to find a way to extend sandboxing to dynamic libraries in a way that is descriptive at the ops level. Something that can be applied as a system-management rule so that it can be distributed to end users now, rather than waiting for the safe versions to get disseminated.

Would making sure that every piece of software has its libraries packaged/signed and in the same folder work? It would minimise common shared libs and updates to the same. Though disk space will be wasted, that should still be acceptable for many use cases.

Sandboxing is separate from authentication. Binaries getting compromised is orthogonal to a non-adversarial defect being used by an adversary.

There's a good chance that if you were to rewrite some of today's existing stack in new languages you'd end up with more bugs, not fewer.

C may be awful in some respects, but for quality it's going to be hard to beat 15 years of peer review with any new language's cool features.

You are conflating a "new and cool language" with a "language that eliminates a ton of possible bugs from the get-go".

These two terms are far from identical.

You might be masking your conservative approach to new languages by hiding behind "C is mature". No it's not: right now somebody on the planet is introducing a buffer overflow without knowing it, while coding in the "mature" C.

Get real already, please. It's high time.

Random example: I dislike Go's error handling but the explicit nature of it has saved me from working half-asleep 50+ times already. Another one: one meager if/else in a supervised Elixir worker saved a server from infinite repeating of a bugged task that would otherwise keep crashing forever. There are others, lots of them. I am sure people can give plenty of examples for Rust as well.

I think you're overreacting. The GP wasn't being conservative about new languages for new projects. S/he was merely warning that rewrites carry their own risks, which might outweigh the benefits of better languages (or for that matter other infrastructure). If you avoid 100 bugs in the new version but add 101 because you didn't completely understand the old code and the environment it runs in, you haven't come out ahead. This phenomenon has been too well known for too long to be blithely ignored.

Me over-reacting is most likely true. Been dealing with people dismissing unquestionably life-improving tech for far too long lately.


I do believe most of Linux userland has to be rewritten though. Be it Go, Rust, Nim, D, doesn't matter much as long as it's a memory-safe language.

> I do believe most of Linux userland has to be rewritten though

Why? I'm sympathetic to the argument that 2017 computing shouldn't be on the basis of 1970s UNIX limitations and mindset, but changing that would require a lot more than just rewriting the user land applications, and would require a bigger re-think.

But assuming that the shell's functionality is OK as it is, what's to be gained in a re-write?

For one thing, piping being mostly text-oriented is limiting in many scenarios I've stumbled upon. A modern shell should allow arbitrary objects to be piped and processed, much like in the functional programming paradigm. The UNIX idea was and still is wonderful, but we're past the text-only thing.

In any case, I feel (and I don't have tens of facts, I admit) that we've been dragged down by the past for far too long. Others have documented their gripes with the current state of ops/sysadmin problems much better than I could, here on HN (though I think years ago).

That makes no sense. There has been a ton of research. C, needing to be backwards-compatible, can only implement a subset of that research. More modern languages, not having this burden, are free to implement the entirety of this research. Therefore, C can only be at most equal to new languages, and overwhelmingly likely worse.

Makes a lot of sense. You have two scenarios: 1) an old code base in C, rewritten in the language du jour; 2) new functionality, in either C or spiffy lang 2.0 (SPIFFY).

For scenario 1, all that functionality has to be duplicated, including "defects". Are you sure that the functionality is 100% there, or did you miss any use cases? That argues for leaving it.

Scenario 2, new widget. Most likely, coding in SPIFFY will be a better choice.

Run old anomalous traffic on old codebase that can handle the corner cases. Run the new traffic over the safe code that covers 80% of the use cases.

Slowly expand the percentage of traffic coverable by the new safe code.

The 80 percent case can be written and coded in a small amount of time. In curl's case, that is fetching a file over HTTP 1.1 using an encoding, possibly following some redirects. Then POST and PUT requests, then chunked uploads and downloads, then...?

Need gopher and ftp? Run the old codepath.

If you had a list of all the corner cases from the start, wouldn't it be easy to just cover them in the new code? I thought the point of corner cases was that they introduce bugs because we don't think of them when we design the system.

If you could infer a spec from the existing code, you'd have at least a $100M product on your hands.

And probably solved the halting problem too ;)

Is that because C is a "good language", or just because rewrites are error prone?

Since we're talking about backdoors, how about compiler ones?

With C, there are several routes to bootstrapping your compiler of choice – there are countless implementations that can be used as intermediates (both closed and open source, for all sorts of architectures, with decades worth of binaries and sources available), and diverse double compilation is a thing.

Rust? Unless you want to go back to the original OCaml version and build hundreds of snapshots (and provided you actually trust your OCaml environment), you've got no choice but to put your faith in a blob.

I'm not against Rust as a language, but it seems counterintuitive to use a language that only has one proper implementation and requires a blob to bootstrap, as a defense against backdoors.

You're referring to trusting-trust backdoors, but I suspect those should be low on the threat model: they seem like they'd be hard to weaponise in a way that survives years of very large changes (in the case of Rust). Just a normal backdoor, a malicious piece of code snuck in, seems more likely, and a full bootstrap isn't necessary to stop that, nor does it actually help at all. (But it's still true that a single implementation is riskier in that respect.)

This is something I've been thinking about quite a bit. It feels like there have to be two kinds of compilers and VMs (if necessary), with different strengths.

One kind of compiler should be like current compilers, with a focus on speed, resource consumption, optimization. Most actual commercial applications would use this compiler, because it provides the fastest and most efficient software.

But beyond that, it might be beneficial to implement compilers with a focus on simplicity and a minimum of dependencies. For example, implement a compiler on an ARM CPU in assembler. The translation step to run this code on an actual CPU is too small and simple to be backdoor'd, and the CPU should be simple or even open.

Such a simplicity-oriented compiler could provide a source of truth, if all components are too simple to backdoor.

In cryptography, there's the concept of a "nothing up my sleeve number". The idea is that if an arbitrary constant has to be put into an algorithm, the author should describe exactly how they picked the number (it's the digits of pi, etc) in a way that leaves no obvious room for maliciousness. (In some algorithms, a specially-crafted constant can create a backdoor.)

I'm rapidly thinking of safe languages in the same way as "nothing up my sleeve numbers". Code written in a safe language is much easier to verify that there aren't any intentional or unintentional backdoors put in by the author.

Maybe better, but I have yet to see anyone publishing code for a remotely feature complete Rust/Go/C#-Curl.

"Hasn't happened yet" is not the same as "it's not possible" though.

> The sort of hypothetical security vulnerability here is likely to depend on programmer's inability to write safe C code.


Well, if it was an intentional/coerced backdoor, then it doesn't matter how good of a C programmer the author is. Actually, you could argue the better they are, the higher the risk is that they'd be successful in hiding the backdoor.

> I’m convinced the most likely backdoor code in curl is a deliberate but hard-to-detect security vulnerability

Evil organisations and/or big government agencies are probably working on finding vulnerabilities and using them without reporting them.

That sounds more efficient than trying to implement backdoors directly, and it's close to impossible to spot or prove.

This is a very real threat for all software, open source and commercial, and hard to completely fix.

That said there are a number of possible mitigations and the fact that they're not more widespread is, to me, an indication that people who rely on software don't think that this threat is worth the trade-off of the additional costs or time that mitigating it would take.

For example:

- Requiring packages signed by the developers for all package managers (e.g. https://theupdateframework.github.io/ ). This would help mitigate the risk of a compromise of the package manager's hosting, but many large software repositories either don't have the concept or don't make much use of it (e.g. npm, rubygems, pip)

- Having some form of third party review of software packages. It would be possible for popular packages like curl to get regular security reviews by independent bodies. That doesn't completely remove the problem of backdoors but it makes it harder for one to go undetected. This one has obvious costs both in financial terms and also in terms of delaying new releases of those packages while reviews are done. There are some things which act a bit like this (e.g. bug bounty programmes) but they're not uniform or regular.

- Liability for insecure software. Really only applies to commercial software, but at the moment there doesn't seem to be much in the way of liability for companies having insecure software, which in turn reduces their incentives to spend money addressing the problem.

I'm sure a load of commercial software includes curl or libcurl, but if there were a backdoor in it that affected that software, I don't think the companies would have any liability for it at the moment, so there's no incentive for them to spend money preventing it.

tinfoil: why is he suddenly writing a blog post on how curl is not backdoored? does this mean curl is backdoored but he can't say directly that it is backdoored :)

I was at dinner with Daniel the night before FOSDEM kicked off this year, and he said "I always get asked at some point in the Q&A about whether or not curl has a backdoor in it."

The next day, I was due to meet a developer from a vendor organisation I'd been working with, and he came and found me during the curl talk Daniel was giving. Because the talk was ongoing, I didn't really get a chance to say hello to him or anything.

Cue the Q&A section. My vendor contact sticks his hand up, and is the first person to be picked to ask a question. His question: "Have you put any backdoors in curl?".

You couldn't make it up.

[edit] I can't spell "Daniel" :(

It's a possibility, but I think he had one talk too many during which he got asked these same questions over and over again.

Since one of the backdoor methods mentioned was introducing a memory safety bug (like a buffer overflow), one way to reduce the attack surface is to use a memory safe language.

The thing is, one can write memory safe code in C. The problem is the difficulty in verifying it is memory safe.

I've opined before that this is why, soon, people will demand that internet facing code be developed with a memory safe language.

>The thing is, one can write memory safe code in C.

One can generate safe C code. We have ample evidence that a human being can't sit down and write it.

Well, even seasoned devs admit that writing memory-safe C is always a tough task. And sometimes, when you have 512kB of memory, you are bound to perform some memory acrobatics.

As soon as you mentioned memory safety, Rust is the language that came to mind. Memory safety is indeed important, but I find it hard to believe that in the future ~80% of mainstream system tools will be migrated from C/C++ to Rust. Is it possible, or have there been any attempts, to write standalone tools that can check before compile time that a particular C/C++ program won't have buffer overflows/dangling pointers? I don't expect such a tool to catch everything, but it should at least be able to catch most such bugs.

EDIT: typos

Detecting 100% of buffer overflows with 0% false alarms would require solving the Halting Problem.

For applications that demand both maximum speed as well as maximum security, the best solution is probably something like a C compiler that requires the code to be accompanied by formal proofs of defined behavior. Even this will necessarily sacrifice speed in a theoretical sense, because there are certain problems for which the fastest solution is safe but can't be proven safe within (PA/ZF/ZFC/insert any consistent foundation of mathematics you like).

Eventually we'll have people writing fast programs and proving their soundness using large cardinal axioms which might or might not actually be true. Then someday one of those large cardinal axioms will turn out to be inconsistent [1] and suddenly some "proven" code will be proven no more.

[1] https://en.wikipedia.org/wiki/Kunen%27s_inconsistency_theore...

Why would you need large cardinal axioms to prove a program correct?

I used large cardinal axioms as [the canonical] example of any axiom stronger than standard mathematical foundations.

A contrived example: there are certain large cardinal axioms that imply the consistency of ZFC. Thus ZFC cannot prove those axioms unless ZFC is inconsistent (Godel's incompleteness theorem). Consider the following problem: "If ZFC can prove 1=0 in n steps, output 1. Else, output 0." A naive solution would brute-force search all ZFC-proofs of length n. A faster solution would be: "Ignore n and immediately output 0." This is correct, because ZFC never proves 1=0. You could formally verify the correctness by using certain large cardinal axioms, but not using raw ZFC.

D is also memory safe, and it has been modified to make it possible to gradually migrate a C program to D.


https://github.com/Microsoft/GSL is an attempt at something like this.

The Core Guidelines have been in development for a long time now; and they do only catch a subset of issues. I'm all in favor of making languages safer overall though!

Frama-C lets you prove all sorts of useful properties about C code, including memory correctness. (Actually writing the proofs is a lot of work though.)


A system that guarantees you full memory safety in C/C++ would probably force you to use programming patterns that are much easier in Rust. At some point you'd essentially be writing Rust code in C/C++ with worse tooling and worse syntax-fit.

Not necessarily. Rust's ownership system is really quite conservative, to the point of not really letting you write doubly linked lists (without unsafe or using indices or something). A memory safety system for C/C++ that lets you prove cyclic pointer structures safe, or even helps you safely implement GC, is probably possible and would be quite cool. We'd probably jump at the chance to use something similar for unsafe Rust, too.

I'm reading a book called 'Nexus' at the moment. Last night I read a part where they are deliberately installing a backdoor in their own system. They do it by modifying the compiler itself to inject malicious machine code into the binary. It's hands down the best technical description of hacking I've ever read in a work of fiction - highly recommended.

Already been done - it was the subject of Ken Thompson's Turing award paper


Your link doesn't seem to work for me, but I presume you're talking about Reflections On Trusting Trust[0]?

[0]: https://www.ece.cmu.edu/~ganger/712.fall02/papers/p761-thomp...

yes, thanks - looks like the ACM hands out temp URLs

If you like that kind of book check out Daniel Suarez.

Daemon had me hooked, it's great when an actual techie does sci-fi.

Looks interesting. Have you read Charles Stross (developer, sometime HN poster) and Vernor Vinge (CS professor)?

Sometimes I wonder: what would happen if one of these invisible heroes died? What would happen to Linux if Linus Torvalds dies? What would happen to curl if Daniel Stenberg dies? For curl, for instance, only Daniel can sign a release. So what happens if he is no longer able to do so? This is just a small example, but you get the idea. There is so much power resting on these men that it sometimes gets very scary.

> For curl for instance, only Daniel can sign a release.

I don't believe that's quite true. Anyone can sign a curl release, so can I. It's just that Daniel's key being used to sign a release of curl carries the trust that this is legitimate. If he were to pass away or be somehow incapacitated, another curl maintainer could start signing the releases.

I have a trusted friend who will automatically receive a mail when I'm gone. Things like this are handled in it.

It sucks to think about it, but I'm glad that I've set up Gmail's inactive account manager this way.

The Linux kernel is a massive project with a web of contributors and maintainers, and it's clear which of the senior level members could step in at any given time.

Big open-source projects have plenty of meatspace to draw on. It's the little projects that 'come from somewhere' that actually only have one or two people 'in the know' that are the ones at risk.

More importantly for Linux, curl, etc., almost nobody uses Linux from Linus or curl from Daniel. You get it from your distributor, and in the case of Linux it usually comes with quite a few patches. These distributors (even the all-volunteer ones like Debian) are projects involving lots of people and clearly defined procedures for what to do if one of their maintainers stops being able to contribute.

A good example is glibc; several years back, a huge number of people were using the eglibc fork, not because glibc upstream (Ulrich Drepper) stopped being able to do releases, but simply because he was refusing patches for architectures he didn't like and other similar changes. Very few end users even noticed that they weren't using "real" glibc. (Ulrich has now stepped down and the eglibc changes have been merged back in.)

Thank you for this! Will always keep it in mind!

This is why servers/networks should be configured to reject/prevent outbound calls by default, only allowing connections to a whitelist.

Was there ever a big, significant backdoor in any widely used open source software?

I don't mean a bug like heartbleed, but an actual intentional backdoor.

Still completely theoretical though, at least as far as I know.

And has a counter anyway: https://www.dwheeler.com/trusting-trust/

It only has a counter assuming you have trusted compiler.

Do you trust yourself? You can choose whatever compiler(s) you want as the 'trusted' one(s), even writing your own. It doesn't even need to be a very good or complete compiler.

What do you compile the trusted compiler with?

Who says it needs to be compiled? Your trusted compiler can be a Basic script.

The dissertation likely addresses all your concerns. Sticking with the Basic example, say you wanted to see if tcc was vulnerable, but you don't trust gcc, icc, borland, clang, etc. either as compilers to use directly or as compilers to compile your trusted one.. And you don't want to write your own in Java or Python because you don't trust the VMs. Whip out an Altair BASIC binary from the 70s and write your compiler with that, just enough to compile tcc. Perform DCC. If the tcc source does not correspond with its binary, you'll know.

It could simply be my lack of in-depth understanding of curl, but wouldn't curl make for a pretty weird target for a backdoor? It doesn't serve content or remain running for long periods of time, does it?

No it doesn't, but what if you can be convinced to use a compromised version of curl that downloads a compromised nginx? This is the threat model that we are looking at here. That's why we should always verify signatures for packages we download, to make sure we get the right bits.

But it's used a lot and makes network connections. If it had a remote-code-execution vulnerability in how it handled connections, then it could provide an easy path for man-in-the-middle attackers to get into systems.

"If one of the curl project members with git push rights would get her account hacked and her SSH key password brute-forced, a very skilled hacker could possibly sneak in something, short-term. Although my hopes are that as we review and comment each others’ code to a very high degree, that would be really hard."

Nip this entire discussion in the bud; just use a deterministic build process for any binaries you release. Like Gitian: https://gitian.org

I implemented this for Zcash (see https://z.cash/blog/deterministic-builds.html), more software projects should be doing this in general.

Curl is the back door.

A modern idiom is "curl https://$THING/ | sudo /bin/bash "

And the arguments around it being secure because it's https...

From the article:

"Additionally, I’m a Swede living in Sweden. The American organizations cannot legally force me to backdoor anything, and the Swedish versions of those secret organizations don’t have the legal rights to do so either (caveat: I’m not a lawyer). So, the real threat is not by legal means."

Considering how well Assange's things turned out with Sweden, that makes me wonder. Of course he is Australian, so there is a big difference.

As others write, this is not specific to curl. Strict, explicit egress filtering is the best (and imho only) safety: a pain in the ass initially, but it avoids connections to any non-whitelisted destination.

How do you know the egress filter is legit?

Well at some point one can never know things for sure. BUT, not running it on the device you want to filter would be a good step. And default deny.
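A hedged sketch of what default-deny egress might look like with iptables (the address is a documentation placeholder, and as noted above you'd ideally run this on a separate gateway, not on the host being filtered):

```shell
# Default-deny outbound traffic, then whitelist explicitly.
iptables -P OUTPUT DROP
iptables -A OUTPUT -o lo -j ACCEPT
iptables -A OUTPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
# 192.0.2.10 (a reserved documentation address) stands in for an
# approved mirror; repeat for each whitelisted destination.
iptables -A OUTPUT -d 192.0.2.10 -p tcp --dport 443 -j ACCEPT
```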

Security is a matter of layers, there is not one layer that fixes all.

I guess I'm a bit cynical but this seems hand wavy to me. (Note, I love curl and implicitly trust it).

The argument that it would probably take too much code and would be too obvious doesn't seem solid. I'm no expert in this area but curl sends data over a network and sometimes runs as part of a larger application. It seems like the big dangerous bits are there and it wouldn't take a major bug to send the wrong thing.

> There is only one way to be sure: review the code you download and intend to use. Or get it from a trusted source that did the review for you.

Agreed. However, this is unfeasible/impractical. Ain't nobody got time for that.

Are there any commercial or non-profit organisations that maintain lists/repos of audited and trusted software with their checksums? It seems like there would be a demand for a package manager like that.

I suppose some Linux distros have pretty OK repositories?

I wouldn't trust npm, pip, gem etc. though.

That's the second half of the quote:

> Or get it from a trusted source that did the review for you.

Incidentally, curl has been reviewed by the Mozilla Secure Open Source project [0], who maintain lists of audits which include checksums within the report. Maybe you're looking for something similar?

[0] https://wiki.mozilla.org/MOSS/Secure_Open_Source

Which version?

In order for this process to be truly secure, every single software version needs to be audited.

7.50.1. The report can be found here: https://wiki.mozilla.org/images/a/aa/Curl-report.pdf

You don't have to do a full audit on each version; auditing the deltas should be fairly comprehensive. Now, there could be some malicious code hidden that gets triggered by a benign change, but otoh, no audit will ever guarantee full security.

How is there an argument that it would take too much code? He gives the example of how inserting an off-by-one error can lead to complete compromise of your machine. Inserting an off-by-one error shouldn't be too much code.

The best way is just defect density. How often do vulns show up? How serious are they? It's always a judgement call of course.

> No. I’ve never seen a deliberate attempt to add a flaw, a vulnerability or a backdoor into curl.

That's exactly what someone who has deliberately put a backdoor into curl would say.

Next paragraph after the one you quoted:

> If I had cooperated in adding a backdoor or been threatened to, then I wouldn’t tell you anyway and I’d thus say no to questions about it.

> Since most of them were triggered by mistakes in code I wrote myself, I can be certain that none of those problems were introduced on purpose


The issue here is now, how can we be collectively sure that the code is safe...

You can be sure that the code is unsafe. Along with almost all other code.


advice before you get hellbanned: jokes like this are frowned upon and against the forum culture. Best to giggle to yourself instead.

Yes I always forget, this forum is for serious discussion of 10x'ing only.

You don't get 'hellbanned' for dumb jokes.

Why does it say [flagged] when it's clearly [deleted]?

I guess it's because it was deleted by moderators rather than by the author himself.

It was flagged by users and then 'dead' because of the number of flags. You can turn on 'showdead' to see it. It wasn't deleted by moderators, they're generally quite open in the (rare) cases they modify a comment or ban a user.

I believe comments can't be deleted after they've been replied to.
