- Someone implemented a YAML parser that executed code. This should have been obviously wrong to them, but it wasn't.
- Thousands of ostensibly competent developers used this parser, saw that it could deserialize more than just data, and never said "Oh dear, that's a massive red flag".
- The bug in the YAML parser was reported and the author of the YAML library genuinely couldn't figure out why this mattered or how it could be bad.
- The issue was reported to RubyGems multiple times and they did nothing.
This isn't the same thing as a complex and accidental bug that even careful engineers have difficulty avoiding, after they've already taken steps to reduce the failure surface of their code through privilege separation, high-level languages/libraries, etc.
This is systemic engineering incompetence that apparently pervades an entire language community, and this is the tipping point where other people start looking for these issues.
If J2EE is a boring platform to you, pick your favorite and Google for a few variants. You'll find a serialization vulnerability. It's hard stuff, by nature.
> The bug in the YAML parser was reported and the author of the YAML library genuinely couldn't figure out why this mattered or how it could be bad.
Do you have a citation for this? What particular bug in the parser are you referring to? The behavior which is being exploited is a fairly complicated interaction between the parser and client Rails code -- I banged my head against the wall trying to get code execution with Ruby 1.8.7's parser for over 12 hours, for example, without any luck unless I coded a too-stupid-to-be-real victim class. (It's my understanding that at least one security researcher has a way to make that happen, but that knowledge was hard won.)
Yes, this is always a bad idea. It's actually in a similar problem space as the constant stream of vulnerabilities in the Java security sandbox (eg, applets); all it takes is one mistake and you lose.
And thus, people have been saying to turn off Java in the browser for 4+ years, and this is also why Spring shouldn't have implemented such code.
> It's hard stuff, by nature.
Which is why deserializing into executable code is a bad idea, by nature. I'd thought this was well established by now, but apparently it is not.
> Do you have a citation for this? What particular bug in the parser are you referring to?
The original target of that claim was the Ruby community. Since this comment concedes that the same issue existed in the Java community, are you leveling the same claim against it? Does every severe security issue that goes unnoticed by a community for some time before finally surfacing suggest pervasive engineering incompetence throughout that entire community? Maybe you would be entirely right to make that claim, because any security issue is indicative of incompetence at some level, but I think the closer your definition of incompetence comes to including everybody, the less useful that definition is.
I'm not sure that means anything. In an OO language, you are always de-serializing into objects, and objects are always 'executable code'. Hashes and Arrays are executable code too, right?
The problem is actually when you allow de-serializing into _arbitrary_ objects of arbitrary classes, and some of those objects have dangerous side effects _just by being instantiated_, and/or have functionality that can turn into an arbitrary code execution vector. (Hopefully Hashes and Arrays don't.)
It is a problem, and it's probably fair to say that you should never have a de-serialization format that takes untrusted input and de-serializes to anything but a small whitelisted set of classes/types. And that many have violated this, and not just in Ruby.
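A minimal sketch of the dangerous direction, in plain Ruby. The class name here is invented for illustration; any class on the load path works the same way, and on modern Psych the permissive behavior has been renamed, so the sketch probes for it:

```ruby
require 'yaml'

# Hypothetical stand-in for any class an attacker can name; imagine it
# had a dangerous side effect when instantiated.
class AuditLogger; end

payload = "--- !ruby/object:AuditLogger {}\n"

# Psych 4 moved the permissive behavior to `unsafe_load`; on older
# Psych versions plain `load` behaves this way.
loader = YAML.respond_to?(:unsafe_load) ? :unsafe_load : :load
obj = YAML.send(loader, payload)
puts obj.class  # the data, not our code, chose the class to instantiate
```

The crucial point is that nothing in our code mentions AuditLogger at the deserialization site; the attacker-controlled bytes do.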
But if you can't even describe the problem/guidance clearly yourself, I think that rather belies your insistence that it's an obvious thing known by the standard competent programmer.
(I am not ashamed to admit it was not obvious to me before these exploits. I think it was not obvious to a bunch of people who are in retrospect _claiming_ it was obvious to them.)
No. You're conflating code and state (which was the problem to begin with!)
Let's disassemble parsing a list of strings:
When you instantiate the individual string objects, you do not 'eval' the data to allow it to direct which string class should be instantiated. You also do not 'eval' the data to determine which fields to set on the string class.
You instantiate a known String type, and you feed it the string representation as an array of non-executable bytes using a method you specified when writing your code -- NOT a method the data specifies.
The data is not executable. It's an array of untrusted bytes. The string code is executable, and it operates on state: the data.
You repeat this process, feeding the string objects into the list object. At no point do you ask the data what class or code you should run to represent it. Your parsing code dictates what classes to instantiate, and the data is interpreted according to those fixed rules, and your data is never executed.
It should never be possible for data to direct the instantiation of types. The relationship must always occur in the opposite direction, whereby known types dictate how to interpret data.
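The safe direction described above is visible in any parser whose output types are fixed up front; Ruby's stdlib JSON parser is a convenient example:

```ruby
require 'json'

# The parser's own code dictates the possible result types (Array, Hash,
# String, Numeric, true/false/nil); the input bytes merely fill them in.
raw = '["alpha", "beta", 42]'
values = JSON.parse(raw)

values.each { |v| puts v.class }  # String, String, Integer -- never anything else
```

No byte sequence in `raw` can cause a class outside that fixed set to be instantiated, which is exactly the "known types dictate how to interpret data" relationship.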
> I think it was not obvious to a bunch of people who are in retrospect _claiming_ it was obvious to them.
Given the preponderance of prior art, this seems unlikely.
It was from allowing de-serialization to arbitrary classes, when it turned out that some classes had dangerous side-effects merely from instantiation -- including in some cases, 'eval' behavior, yes, but the eval behavior wasn't in YAML, it was in other classes, where it could be triggered by instantiation.
To use your language, I don't think it's 'intellectually honest' to call allowing de-serialization to data-specified classes "a YAML parser that executed code" -- that's misleading -- or to say that a 'trained monkey should have known it was a bad idea' (allowing de-serialization to arbitrary data-specified classes).
There have been multiple vulnerabilities _just like this_ in other environments, including several in Java (and in major popular Java packages). You could say with all that prior art it ought to have been obvious, but of course you could say that for each of the multiple prior vulnerabilities too. Of course, each time there's even more prior art, and for whatever reason this one finally got enough publicity that maybe this kind of vulnerability will be common knowledge now.
> It was from allowing de-serialization to arbitrary classes, when it turned out that some classes had dangerous side-effects merely from instantiation -- including in some cases, 'eval' behavior, yes, but the eval behavior wasn't in YAML, it was in other classes, where it could be triggered by instantiation.
That is eval behavior.
In a traditionally compiled OO language like C++, classes cease to exist after compilation; there is no fully generic way to instantiate an object of a class by data determined at runtime. So this whole concept of deserializing to whatever the protocol specifies goes completely out of the door.
(You can instantiate objects with classes specified by data in Java too, although Java isn't usually considered exactly dynamically interpreted. In fact, there was a very analogous bug in Spring, as mentioned in many places in this comment thread. But anyway, okay, sufficiently dynamically interpreted to allow instantiation of objects with classes chosen at runtime... is the root of the problem, you're suggesting? If everyone just used C++ it would be fine?)
In terms of that issue request, I doubt that adding a safe_load option would have stopped the Rails vulnerability. After all, the Rails guys _already knew_ that they should not be loading YAML from the request body; that's why it was not allowed directly. The issue was loading XML, which then allowed YAML to be loaded. Allowing YAML to be loaded there was a mistake; it seems unlikely that someone would make that mistake, while at the same time mitigating it by adding safe_load.
W.r.t RubyGems, I hear what you're saying, but that doesn't mean there's a bug in psych. Even the feature request of adding a safe_load option strikes me as problematic...either you're limiting the markup to json with comments, or you'd have to name the option something like sort_of_safe_load.
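For what it's worth, Psych did eventually grow exactly this option. A sketch of how it behaves in current Psych -- the `permitted_classes:` keyword spelling is from Psych 3.1+, so treat the exact API as version-dependent:

```ruby
require 'yaml'
require 'date'

# safe_load allows only a fixed scalar/collection set plus an explicit
# whitelist of extra classes (Date here).
data = YAML.safe_load("released: 2013-01-28\nname: rails\n",
                      permitted_classes: [Date])
puts data['name']  # => rails

begin
  YAML.safe_load("--- !ruby/object:Gem::Requirement {}\n")
rescue Psych::DisallowedClass
  puts 'rejected'  # data-specified classes raise instead of instantiating
end
```

Whether that counts as "safe" or merely "sort_of_safe" is, as above, a fair question -- but raising on any non-whitelisted tag is at least a fail-closed default.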
Wackiness ensued: http://blog.o0o.nu/2010/07/cve-2010-1870-struts2xwork-remote...
It would obviously be unfair to claim on this basis, or the recent problems with the Java browser plugin, that the "entire Java language community" has a bad attitude on security matters. Communities are big, each of them has a range of attitudes within it, and most importantly --- regardless of attitude --- sooner or later, everyone screws up.
The particular issue in the YAML parser is explained pretty well here: http://www.insinuator.net/2013/01/rails-yaml/
First, given how many times I've seen a deserialization library "helpfully" allow you to deserialize into arbitrary objects in a language that is sufficiently dynamic to turn this into arbitrary code execution, evidence suggests this is not an accurate summary. I'd like to see "Don't deserialize into arbitrary objects" become General Programming Wisdom, but it is not there yet.
It's not like we live in a world where XSS is rare or anything anyhow. The general level of programming aptitude is low here. That's bad, regrettable, something I'd love to see change and love to help change, but it is also something we have to deal with as a brute fact.
Secondly, there's still the points of A: even if you don't use Ruby on Rails, your life may still be adversely affected by the Severity: Apocalyptic bug, and B: what are you going to do when the Severity: Apocalyptic bug is located in your codebase? And that's putting aside the obvious matters of what to do if you use Ruby on Rails and this was your codebase. The exact details of today's Severity: Apocalyptic bug are less relevant than you may initially think. Go back and read the piece, strike every sentence that contains "YAML". It's still a very important piece.
At which point a re-quoting of my favorite line in the piece is probably called for: "If you believe in karma or capricious supernatural agencies which have an active interest in balancing accounts, chortling about Ruby on Rails developers suffering at the moment would be about as well-advised as a classical Roman cursing the gods during a thunderstorm while tapdancing naked in the pool on top of a temple consecrated to Zeus while holding nothing but a bronze rod used for making obscene gestures towards the heavens." Epic.
I think that's pifflesnort's point.
You're definitely right that the security reports should be handled better. I hope that this whole situation results in a better security culture in the Ruby community.
Regarding your tone ("intellectually dishonest", "trained monkey", "systemic engineering incompetence pervades an entire language community"), it's a bit of hyperbole and active trolling. You are certainly right in many of your points, and you are certainly coming off as a jerk. It may not be as cathartic for you, but I'd suggest toning it down to "reasonable human being" level in the future.
The Rails community has exhibited such self-assured, self-promotional exuberance for so long (and continues to do so here), it feels necessary to rely on equivalently forceful and bellicose language to have a hope of countering the spin and marketing messaging.
Case in point, the article seriously says, with a straight face:
"They’re being found at breakneck pace right now precisely because they required substantial new security technology to actually exploit, and that new technology has unlocked an exciting new frontier in vulnerability research."
Substantial new security technology? To claim that a well known vulnerability source -- parsers executing code -- involves not only substantial new technology, but is a new frontier in vulnerability research?
This is pure marketing drivel intended to spin responsibility away from Ruby/Rails, because the problems are somehow advanced and new. This is not coming from some unknown corner of the community, but from a well-known entity with a significant voice.
I'll also raise an eyebrow at that particular sentence, though without spending much time looking into what's backing it, I can only add that I'm slightly incredulous of it too.
I definitely question your stated intent. Were you to "counter the spin and marketing messaging", would that reduce the number of vulnerable machines? Overall, reduce the number of people that use Ruby/Rails, if that is your intent? Given the number of comments you've made to that effect versus the number of folks using Ruby/Rails, I'd suggest you have a very long battle in front of you.
Put another way, I perceive your tone as an exasperated, reactionary tone to a group that you happen not to like. If you are indeed trying to achieve some greater good here, I believe there's more effective ways you could achieve it.
Otherwise, just tone it down in the future. You had good points, there's no need to insult people from an effectively unassailable position.
I'd like it to be 'cool' in the Ruby community to apply serious care towards security, API stability, code maintainability, and all the other things that aren't necessarily fun, but are very much necessary to avoid both huge aggregate overhead over time, and huge expensive failures like this one.
I'd like to see a shift towards an engineering culture where taking the time to consider things seriously is considered 'cooler' than spinning funny project names, promoting swearing in presentations, and posting ironic videos.
It seems increasingly obvious to me that for this to occur, one can succeed in pushing back against emotive marketing with a similar approach, and thus shift the conversation.
Is that seriously what happened? It sounds oddly similar to the Rails issue from about a year ago (the one in which the reporter was able to commit to master on Github), even though I believe that was a separate set of developers altogether.
If so, then that might suggest a larger community/cultural issue, which makes me wonder what other exploits exist but haven't been reported (publicly) yet...
And the RubyGems folks are trying to handle this with whitelisting specific classes that the YAML parsing will still be allowed to instantiate:
We can either sit around throwing stones at them or pull up our sleeves and help. I'm not sure what there is to gain with the former.
And even if you get it wrong, you get it wrong in a different way. That might mean that you are technically more at risk, but so long as the attack is focused on getting as many targets as possible, rather than you explicitly, then that is arguably a great strategy: the cost of adapting an already existing attack to a novel target is going to be astronomically high, versus using an already existing vulnerability. If you are refining nuclear material for Iran, you are going to need all the protection you can get; if you are just another start-up, you just need to not be vulnerable to the latest drive-by exploit.
There is no karma here, there is just a race to the bottom for all of us. I thought the point of open source was for us all to group together and find and address these issues?
You know, kumbaya and all that...
Failure to blacklist non-conforming input.
Really, it is that simple and that complicated.
Edit: I'm genuinely interested - I always try and whitelist things when I'm building software. Although I have next to no background when it comes to security in particular.
Whitelisting is what the rubygems folks are doing to work around this problem until a better implementation is put in-place in the YAML parser.
Generally, it is a better solution but it is more difficult and can break a lot of dependencies if not implemented correctly.
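A sketch of that whitelisting approach -- the permitted list and helper below are illustrative, not RubyGems' actual code: scan the parsed node tree and reject any `!ruby/...` tag naming a class outside the whitelist, before ever loading the document.

```ruby
require 'psych'

# Illustrative whitelist -- not the real RubyGems list.
PERMITTED = ['Gem::Version', 'Gem::Requirement'].freeze

def only_permitted_tags?(yaml_text)
  # Psych nodes are Enumerable over themselves and all descendants,
  # so every tag can be inspected before anything is instantiated.
  Psych.parse_stream(yaml_text).all? do |node|
    tag = node.respond_to?(:tag) ? node.tag : nil
    tag.nil? || !tag.start_with?('!ruby/') ||
      PERMITTED.include?(tag.sub('!ruby/object:', ''))
  end
end

puts only_permitted_tags?("--- !ruby/object:Gem::Version\nversion: '1.0'\n")
puts only_permitted_tags?("--- !ruby/object:File {}\n")
```

The design point is that this checks tags at the parse-tree level, so nothing attacker-specified gets instantiated while you decide whether to trust the document.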
Yes, you are absolutely right.
Hey, yes, the YAML bug is _very_ similar. A whitelist is better than no list at all.
Also, more than other communities, Ruby has a cultural gap between the people developing the language and core libraries and the people using it to write web apps and frameworks.
Here's two good technical writeups of the exploit as it applies to Rails apps: http://blog.codeclimate.com/blog/2013/01/10/rails-remote-cod... http://ronin-ruby.github.com/blog/2013/01/09/rails-pocs.html
My point is that it's 'taken so long' because all this code is stuff that was written in a totally different time and place. And then was built on top of, after years and years and years.
Now that it _is_ being examined, that's why you see so many advisories. This is a good thing, not a bad one! It's all being looked through and taken care of.
And then, as someone else said, because of layering. The next downstream user of YAML might not have even realized that YAML had this feature, on top of not realizing the danger of this feature. And then someone else downstream of THAT library, etc.
Maybe it _should_ have been obvious, but it wasn't, as evidenced, as you say, by all the people who have done it before. After the FIRST time it was discovered, it should have been obvious, why did it happen even a second?
In part, because for whatever reason, none of those exploits got the (negative) publicity that the Rails/YAML one is getting. Hopefully it (the danger of serialization formats allowing arbitrary class/type de-serialization) WILL become obvious to competent developers NOW, but it was not before.
20 years ago, you could write code thinking that giving untrusted user input to it was a _special case_. "Well, I guess, now that you mention it, if you give untrusted input that may have been constructed by an attacker to this function it would be dangerous, but why/how would anyone do that?" Things have changed. There's a lot more code where you should be assuming that passing untrusted input to it will be done, unless you specifically and loudly document not to. But we're still using a lot of code written under the assumptions of 20 years ago -- assumptions that were not necessarily wrong cost/benefit analyses 20 years ago. And yeah, some people are still WRITING code under the security assumptions of 20 years ago too, oops.
At the same time, we have a LOT MORE code _sharing_ than we had 20 years ago. (internet open source has changed the way software is written, drastically) And ruby community is especially 'advanced' at code sharing, using each other's code as dependencies in a complex multi-generation dependency graph. That greatly increases the danger of unexpected interactions of features creating security exploits that would not have been predicted by looking at any part in isolation. But we couldn't accomplish what we have all accomplished without using other people's open source code as more-or-less black box building blocks for our own, we can't do a full security audit of all of our dependencies (and our dependencies' dependencies etc).
Of course, you could argue that developers should always be thinking about and searching for security related issues in whatever field they're working in, but that doesn't appear to be the norm at the moment.
I thought you could unpickle untrusted input in Python? Sure there's a great big red warning message on the documentation, and hence it's currently rare for people to do it, but it is technically allowed, right?
This is master-level, "Captain Obvious"-style trolling; it's beyond me how this is the top comment in a place like HN.
Someone implemented a YAML parser that could serialize and de-serialize arbitrary objects referenced by class name.
It was not obvious that this meant it 'executed code', let alone that this meant it could execute _arbitrary_ code, so long as there was a predictable class in the load path with certain characteristics, which there was in Rails.
In retrospect it is obvious, but I think you over-estimate the obviousness without hindsight. It's always easy to say everyone should have known what nobody actually did but which everyone now does.
As others have pointed out, an almost identical problem existed in Spring too (de-serializing arbitrary objects leads to arbitrary code execution). It wasn't obvious to them either. Maybe it _should_ have been obvious _after_ that happened -- but that vulnerability didn't get much publicity. Now that the YAML one has, maybe it hopefully WILL be obvious next time!
Anyhow, that lack of obviousness applies to at least your first two points if not first three. It was not in fact obvious to most people that you could execute (arbitrary) code with YAML. If it was obvious to you, I wish you had spent more time trying to 'paul revere' it.
> The issue was reported to RubyGems multiple times and they did nothing.
Now, THAT part, yeah, that's a problem. I think 'multiple times' is 'two' (yeah, that is technically 'multiple'), and only over a week -- but that still indicates irresponsibility on the rubygems.org maintainers' part. A piece of infrastructure that, if compromised, can lead to the compromise of almost all of rubydom -- that is scary, and it needs a lot more responsibility than it got. We're lucky the exploit was in fact publicized rather than kept secret and exploited to inject an attack into the code of any ruby gem an attacker wanted -- except of course, we can't know for sure that it wasn't.
Er, there would have been trouble on that end too ...
Indeed. It's the "fallacy of gray". Nothing is black or white, hence everything is gray. Nothing is 100% secure, nothing is 100% insecure, hence everything is "semi-secure": it's bad, but not too bad, because every language / API / server can be attacked.
You've effectively substituted a black/white dichotomy with something even worse: instead of having only two options (black or white), you now only have one: gray.
It is probably one of the most intellectually dishonest logical fallacies of all time, and we keep seeing it more and more.
It's really concerning.
There are many developers who are not presently active on a Ruby on Rails project who nonetheless have a vulnerable Rails application running on localhost:3000. If they do, eventually, their local machine will be rooted:

> ...root your Macbook if it is running an out-of-date Rails on it. No, it does not matter that the Internet can’t connect to your localhost:3000, because your browser can, and your browser will follow the attacker’s instructions to do so. It will probably be possible to eventually do this with an IMG tag, which means any webpage that can contain a user-supplied cat photo could ALSO contain a user-supplied remote code execution.
In addition to common port numbers and stuff like redmine, their tipoffs include looking for Rails-style session cookies, and HTTP response headers emitted by Rails or support machinery. These include "X-Rack-Cache:" and the "X-Powered-By:" header that Phusion Passenger tosses in even if you've configured Apache itself to leave version numbers and component identifiers out of the response. (I'm not sure there's any better way to suppress this stuff than adding mod_headers to the Apache config and using "Header unset")
There are also a lot fewer headaches once you've decided to move it into production.
NO EXPLOIT FOR LOCALHOST:3000 calm down
You see this in things such as security issues being marked as wontfix until they are actively exploited (e.g. the Homakov/GitHub incident), in the attitude that developer cycles are more expensive than CPU cycles, and on a more puerile level in the tendency towards swearing in presentations.
I've always had the impression that the Rails ecosystem favours convenience over security, in an Agile Manifesto kind of way (yes, we value the stuff on the right, but we value the stuff on the left even more). One of the attractions of Rails is that it is very easy to get stuff up and running with it, but some of the security exploits that I've seen cropping up recently with it make me pretty worried about it. I get especially concerned when I see SQL injection vulnerabilities in a framework based on an O/R mapper, for instance.
Many start-ups are built by well-meaning people who have no formal CS or even engineering background and thus are somewhat out of touch with what it means to build a robust system. It's natural for people to focus on "what's important" and ignore boundary/edge conditions, while in reality 90% of sound engineering is getting boundary/edge cases right.
And as most of such start-ups use Ruby/Rails due to the easiness of "getting it up and running", and thus they inject the Ruby/Rails ecosystem with this "focus on what's important" mindset, important boundary issues, including security, are neglected.
I think in 2006/2007, there was a simplicity to the basic "get up and running" aspect, but Rails 3.x+ is a pretty large ecosystem with quite a lot of decision points to educate yourself on to do any sized project beyond 'hello world'.
The exploits have happened in ways that have exposed and hammered home the myriad places many applications expose unexpected side channels and larger attack surfaces than you'd think. These issues have opened a broader range of people to vulnerability, and I think opened a lot of people's eyes to the need for a sense of security and what that really means.
Top that with the level of explanation we've seen in at least the Rails and Ruby exploits, it's been a tremendous educational opportunity for a lot of people who will benefit greatly from it, and by proxy their users.
When the idea of a "SQL Injection" first became really prevalent, we saw an uptick in concern for security amongst framework developers, as far as I could tell. I think this will help get some momentum going again.
Speaking as a non-expert on the subject, security is all about a healthy sense of paranoia, across the board :)
I was going to post something similar. Also we often see people insulting others when they post exploits too early or describe exploits in depth too early. Posting stuff like: "You're an .ssh.le, wait a few days before posting that".
I don't think so. I think exploits should be publicly posted as soon as possible, affecting as many people as possible. Maybe even damaging exploits, actively deleting user data or server data.
The bigger the havoc, the sooner the entire industry is going to realize security is a very real concern.
People are still considering buffer overflow, SQL injection, query parameters objects instantiation through deserialization exploits, etc. to be "normal" because "everybody creates bugs" and "a lot of bugs can be exploited".
I think it's the wrong mindset. Security is of utmost importance and should be thought of from the start.
For example, I'm amazed by the recent seL4 microkernel, which makes buffer overflows provably impossible (inside the microkernel), or even the Java VM (the JVM), which makes buffer overflows in Java code impossible. It's not perfect -- we've seen lots of major Java exploits, some in third-party C libs -- but zero of them have been buffer overruns/overflows in Java code itself.
So security exploits are not inevitable.
All we need is for people, from the very start, to design systems more resilient to attacks.
The more attacks, the more exploits, the more bad reputation and shame on clueless developers, the better.
I actually start to love these exploits, because they fuel healthy research by the white-hat community.
And one day we'll have more secure microkernels, more secure OSes, more secure VMs, more secure protocols, etc.
Let the security exploits come.
If you are like me, you would expect that YAML was used in the configuration files and nowhere else. A small framework like Sinatra wouldn't have been big enough to hide an issue like this.
I don't mean to beat on the Rails guys too hard though, they're off shipping stuff and I'm not and I'm not very fond of those who criticize while a safe distance from the action. But I think it's fair to say that this could have been foreseen earlier (or much earlier, depending on who you ask).
I understand the appeal of "magic" to solve issues when you are under a deadline. It is just that trusting it is dangerous.
What technology is he talking about here?
When I first read your blog post I got the impression that you were saying that the YAML vulnerability were found with some new code scanning technology that lets us find bugs in Rails faster. Or are you just saying discovering the existence of the YAML.load() class of vulnerability is "new security technology?"
Or are you talking about the ronin support module people are using in some of the PoCs?
+ Some objects are unsafe to instantiate if you don't pick all values you initialize them with very carefully.
+ YAML can instantiate objects from any class.
+ Rails uses YAML, in a lot of ways.
You might have said "Yes, I am aware of all these three things. Do you have anything important to tell me?" Now, if I demonstrate to you working PoC code which combines those three into remote code execution, the substantial work involved in producing that PoC code -- finding the vulnerable classes which ship with Rails, demonstrating how to get data from where the user controls it into the far-inside-the-framework bits where Rails might actually evaluate YAML, etc etc -- immediately starts suggesting lots of other fun ways to use variants of that trick.
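To make the first of those facts concrete: deserialization doesn't even run `initialize`, so every invariant it establishes is gone. (The class below is invented for illustration.)

```ruby
require 'yaml'

class SessionStore
  attr_reader :path

  def initialize
    @path = '/tmp/sessions'  # the invariant our code thinks it controls
  end
end

# Psych allocates the object and sets instance variables straight from
# the attacker's mapping -- #initialize is never called.
payload = "--- !ruby/object:SessionStore\npath: /etc/passwd\n"
loader = YAML.respond_to?(:unsafe_load) ? :unsafe_load : :load
store = YAML.send(loader, payload)
puts store.path  # => /etc/passwd
```

Any method that trusts `@path` (or any other carefully-picked value) is now operating on attacker-chosen state, which is exactly the raw material the PoC work turns into code execution.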
An experienced engineer ought to have said "this is a perfect storm, and it is wrong that YAML can instantiate objects from any class, and there will be a vulnerability here".
The reason such an engineer ought to say this is because 1) In general terms, it should be self-evident that any system built on riding the edge of risk will fail, and 2) We have countless examples over decades of this exact issue occurring repeatedly.
If you need a PoC to understand the severity of such an issue, you do not have the proper engineering mindset to be writing secure code. This was a lesson much of the industry learned in the 90s, where it was necessary to provide a PoC before many developers would take action on an issue.
Over the past few years people have developed technology to make it easier to exploit null pointer dereference bugs for instance. That doesn't mean they weren't security bugs before we were good at reliably exploiting them.
Exploitation techniques increase the impact of vulnerabilities certainly, but the 3 facts you stated above would indicate a security issue even before we knew the right class to instantiate.
However if told any one of them you might not worry enough even if the other facts were somewhere deep in your background knowledge.
Yes, it's super easy to call everyone involved with the YAML library incompetent, but let's be honest - they're not, in general. They fucked up here, and hindsight is 20/20, but I think it's only face-stabbingly obvious now because of what's actually happened.
I do more Perl but I can tell you that "this deserializer can create new arbitrary objects" would give screaming alarm bells. And that is because there is a long history of trying (and failing) to safely do stuff like this (e.g. note the lack of warrant for the Safe CPAN module: http://search.cpan.org/~jhi/perl-5.8.0/ext/Opcode/Safe.pm )
Python has the same well-known and well-documented issue with their pickle module.
In general using /any/ deserializer that can create arbitrary objects of arbitrary classes has been known to be a bad idea for some time, and as far as I can tell Ruby YAML documents that it supports doing exactly this: http://www.yaml.org/YAML_for_ruby.html#objects
So if we were talking about a security vuln from something like JSON where we expect benign data to be the only possible output I think I'd agree completely.
Using a deserializer even more powerful than that is at the very least a bad smell from the POV of security, especially post-Spring (fixed in 2011, even if it was re-iterated in 2013), so I wouldn't be so quick to claim this could only have been predicted in hindsight.
I get that it's still hard work to move from "I made an object of my choosing" to "framework pwned" but you pretty much have to assume that the former implies the latter nowadays. It was more than 5 years ago now that tptacek was gushing over Dowd's "Inhuman Flash Exploit" and I somehow don't think that pen testers and security experts have gotten any dumber since then. ;)
I think that to suggest that this bug could not have been found before is wrong, but the reason we're seeing such a cascade is because security almost never happens in a bubble.
Previously you had to send something to Rails and find a way to cause Rails to execute it. Not so easy.
Now? You just have to send some YAML to Rails.
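To make the "just send some YAML" point concrete: the Rails bug (CVE-2013-0156) rode in on XML parameter parsing, which honored an embedded type attribute. A request body shaped like the sketch below was enough to reach the YAML deserializer (the gadget class name is a placeholder; the actual chain from object instantiation to code execution is omitted):

```ruby
# Shape of the malicious request body only. Sent with
# Content-Type: text/xml, pre-patch Rails would hand the element's
# contents straight to the YAML loader during parameter parsing.
body = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <probe type="yaml">--- !ruby/object:SomeGadget {}</probe>
XML
```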
I had been meaning to get some context for the recent spate of security problems and this provided that in spades. Thanks for taking the time to write it up and post it.
Who was the first reported compromise of a production system?
1) Is it currently safe to "bundle update" and be confident that only verified Gems will be provided? I don't mind errors on any unverified ones but don't want to download them.
2) Is there a drop in replacement for RubyGems? The problems that have occurred this month would have been multiplied if RubyGems was unavailable at the time Rails had an apocalyptic bug.
1. I wouldn't say so. Not until they're all the way through.
2. Not at the moment, but general guidance is that we should all have local gem repos that we maintain ourselves and only rely on external sources when needed. It is something I'm going to look into ASAP.
It's a shame that they seem to have put the service back up in an unsafe mode; I would have hoped that they could have quarantined the unverified Gems.
Edit: Looking at the status page the API is down so it can't be accessed from Bundler so they are doing it the good/safe way.
Obviously then it is up to you to verify everything, including that you're using the right versions and what not.
I hope they learn from this and stop chanting "convention over configuration" when told that explicit is better than implicit.
Or should I basically just not run Rails on any machine ever anymore, get a different web server, and start implementing my own request routing and ORM without any sort of YAML-parsing magic?
>One of my friends who is an actual security researcher has deleted all of his accounts on Internet services which he knows to use Ruby on Rails. That’s not an insane measure.
So anyone who uses Twitter, for example, could have their passwords and other data stolen through this exploit?
Long story short: There's a variety of things that can be done to mitigate this vulnerability and an active conversation about which is the best option. My go-to suggestion would be having Rails either ship with a non-stdlib YAML serialization/deserialization parser or modify the stdlib one, with the major point of departure being "Raise an exception immediately if the YAML encodes any object not on a configurable whitelist, and default that whitelist to ~5 core classes generally considered to be safe."
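Today's Psych ships essentially this idea as safe_load, so the whitelist approach can be sketched directly (the permitted_classes keyword is the modern spelling; older Psych versions took a positional whitelist instead):

```ruby
require 'yaml'
require 'date'

# A small, explicit whitelist on top of the always-allowed scalars,
# arrays, and hashes.
PERMITTED = [Symbol, Date, Time].freeze

def load_untrusted(yaml)
  YAML.safe_load(yaml, permitted_classes: PERMITTED)
end

load_untrusted("--- [1, 2, 3]")  # plain data parses fine

begin
  load_untrusted("--- !ruby/object:Range {}")
rescue Psych::DisallowedClass
  # the exception fires before any Range is ever allocated
end
```

The key property is that rejection happens at parse time, before any constructor or attribute-setting code for the disallowed class runs.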
That is astonishingly unlikely to be a net-win for your security.
I'd expect that Twitter (in particular) has a better handle on it than your average startup, but successful exploitation of this means the attacker owns the server, if the attacker owns the server they probably get all the servers, and they will tend to gain control of any information on all of the servers. That can include, but is certainly not upper-bounded by, passwords/hashes stored in the database. It is absolutely possible, and indeed likely, that many people will be adversely affected by this vulnerability without themselves running Rails or even, for that matter, knowing what Rails is.
>That is astonishingly unlikely to be a net-win for your security.
In the long run, you are probably right. Once this gets fixed, which will probably be soon considering how much attention is on it.
But in the short run, is there anything worse than a vulnerability that allows a remote attacker to automatically detect, penetrate, and execute arbitrary code on your machine? To the point where it's not even safe to run the framework on localhost on your dev box?
By making that the default schema, developers would have to explicitly request the dangerous "ruby" schema that makes arbitrary Ruby objects.
My question: do these security issues affect Sinatra apps?
Why are you running Rails as the root user? This is a bad idea.
3. When you load that URL, it causes code execution, causing your computer to open a connection somewhere and start taking instructions.
4. The instructions that arrive include downloading and installing software that takes advantage of known local root vulnerabilities in OS X.
5. Congratulations! Someone rooted your machine!
To create a GET, inject an <img>, <script>, <iframe>, or <style> tag. (Or several others.)
To create a POST, inject a <form> tag, and call form.submit()
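As a sketch of what those injections boil down to (the target URL and parameter name here are invented), the attacker's injected markup looks like:

```ruby
target = "http://victim.example/some_endpoint"  # hypothetical

# GET: fires as soon as the browser tries to fetch the "image".
get_payload = %(<img src="#{target}?probe=1">)

# POST: a hidden form plus one line of script to submit it.
post_payload = <<~HTML
  <form id="f" method="post" action="#{target}">
    <input type="hidden" name="probe" value="1">
  </form>
  <script>document.getElementById("f").submit();</script>
HTML
```

Either payload rides the victim's browser, and the victim's cookies, to the target; no interaction beyond loading the attacker's page is required.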
No, it is much worse.
I am still convinced that configs and templates should be treated as executable code and are best implemented in the same language they're used from. At least it makes certain things blatantly obvious. (It also makes a lot of other things possible without any extra coding/learning.)
So I think it only helps if you are likely to need to deploy additional/alternative servers of the same versions. For significant deployed services this makes sense, but if you are only in development/testing, or using a service like Heroku, it doesn't really help you very much, does it?
At least your deployments will be consistent. This is a great starting point. Now all you have to do is check your cache against the backdoored version, and you instantly and verifiably know where your deployment stands.
bundle install --local
There is also https://github.com/dtao/safe_yaml (hat tip @patio11, who also points out that this has not been audited for completeness/correctness)
I could see it as a service company that shares blacklist info between sites and can even find new exploits from the "bad" requests.
There was a time when anyone who claimed to have the ability could design and build things like bridges and buildings. After enough of them collapsed due to repeated, avoidable mistakes, we said no, you can't do that anymore, you need to be licensed to design and build buildings, and furthermore you have to follow some basic minimum conventions that are proven to work. And you and your firm have to take on personal liability when you certify that your design and construction follow those basic best practices.
It would be good if all this was a clarion call to the Ruby community to improve things holistically, rather than the current trend of band-aid fixes they seem to apply.
Every popular technology goes through this. (C, Java, PHP, etc)
What is encouraging to me is the speed with which these issues get patched in Ruby and Rails, and how the ecosystem is paying attention to these lessons and learning from them.
Contrast this with the length of time recent Java flaws took to get patched (6 months or more), or how some of the bugs reported in TOSSA took years to get fixed.
The deal is to learn from each of these incidents.
Very few people want to take the trouble to write and use correct programs. We, as an industry, would rather Ship Early and Often. It takes a lot of energy and time to write correct programs. Very few do that. Three that come to mind are Dijkstra, Knuth, DJB.
Because other frameworks are rock-solid. Yup. None of this happens anywhere else on the internet.
It does happen everywhere. It should be stopped everywhere. But it happens more frequently in some places. There are special conditions that permit it to happen in some places. And if it is a serious concern of yours, knowing where it is and isn't most likely to happen again is important.
The Perl YAML warning is less obvious but they at least mention in their LoadCode docs (http://search.cpan.org/~mstrout/YAML-0.84/lib/YAML.pm) that you have to specifically enable code deserialization since untrusted evaluation is a bad idea.
Python's YAML is only slightly worse, with an available safe_load method that refuses to run code (and a failure to use appropriately led to vulns in popular Django plugins a little more than a year ago).
There's no easy equivalent to safe_load or UseCode for Ruby's YAML (http://apidock.com/ruby/Psych) as far as I can tell, at least while still using the high-level parser. And I'll note that the API docs I provided are for the new YAML parser introduced with 1.9.3. I would like to think that by 2010 there would be a general awareness of the risk of using deserializers/code emitters on untrusted input.
In Common Lisp, for example, you can bind the *read-eval* flag to nil so that the reader does no evaluation ever (it signals an error on read-time evaluation) and, hence, unless you call eval yourself specifically, nothing is ever going to be evaluated.
But how would that work in Clojure? And what about other languages? Ruby? Haskell? Java? C#?
I think the ability to execute code has become the most important security issue, more so than buffer overflows/overruns, which can now be prevented (sometimes even proven impossible thanks to theorem provers).
More thoughts should be put into explaining how/when a language / API can execute code and how it should/can be used to prevent such a thing from happening.
As someone who loves Rails, to someone who presumably likes Rails, it is imperative that you understand how serious this issue is. If you use Rails, you need to have addressed this already. If you have not, drop what you're doing and go fix it right now.
The Fear seems appropriate.