When I was working as a pen tester I would thoroughly scold developers for letting this happen, telling them that with everything we know today about security and good programming practices, there is no way you should allow it. Off-by-one bugs, timing attacks, etc. are more excusable, but this? This is just amateur hour.
That was 11 years ago.
The problem is, these kids came straight from college. They don't teach you stuff like "writing a secure web application" in college, or even try to.
(Not that this is unreasonable, though perhaps I'm suggesting that there should be different career paths for CS majors and people who intend to be professional programmers. (I say as a CS-educated professional programmer))
Imagine if your nurse came out of college having never set foot in a hospital, having only read about how to take vitals and such, but never having done it on a live human being.
The real reason why we have an issue producing quality programmers is more fundamental; we just don't have enough people who are good at teaching it. It's not because we "waste" time on one or two courses on some theoretical aspects.
Another thing to consider is allowing Computer Science majors to opt out of general education requirements in favor of more programming classes. Allowing this would free up a semester or more for most undergraduates (as opposed to the one course saved from cutting theory). Even with just average teaching, a semester can make a big difference.
Nurses and especially Doctors have years of pure science before they're ever allowed into their trade schools.
Likewise there are a variety of software careers, from sysadmin to developer to architect that require varied levels of education (though much like nurses developers can only benefit from better understanding of the principles behind their art.)
This is an (unnecessary) North American tic, whose pernicious influence is spreading.
Medicine has traditionally been an undergraduate degree in Europe and all former European colonies aside from the US, and those countries that are in its cultural sphere (like S. Korea, which got rid of its undergraduate medicine degrees.) My cousin started his Medicine degree at 17. It will take him five years. It's not like this is even unknown in the States, IIRC UCSD has a runaround where you get a Bachelor while doing an M.D.
And even in the US there are different types of nurses, some of whom went to college and some who didn't (LPN, RN, and Nurse Practitioner). I understand demanding continuing education and testing to ensure competency; college is one means of doing that, but not the only one.
That's one of my biggest regrets about my CS degrees (BS/MS). I took so many classes and did so many class projects that I never had to see one project all the way through to a complete, tested, usable release.
And I don't think it's a CS department's responsibility to expect class projects to be built to a shippable standard. However, I do think that it's the school's duty to encourage students to work on a real, production project, of their own creation or as a contributor, on their own time - even if it means taking on a reduced academic load.
But that's why the apprenticeship model would work so well. I've always thought the way union electricians are trained (a four year apprenticeship, which includes a lot of practical and theoretical schooling, until becoming a journeyman) would be a great fit for software development.
Over 5 years, students (usually aged 14 when they enter the school) receive practical training in programming as well as a theoretical foundation, though in no way as thoroughly as in any university program.
In the first year of the informatics branch they let students enjoy the beauty of programming linked lists in C, which is quite tough for many.
...and that's exactly what people are complaining about: the Diaspora devs learning about software development practices in the beginning of their career, while writing code for release.
I see Master's level students all the time who don't even know the basics of programming. That's just not acceptable and gives the university a bad rep.
Simply put, they focused too much on trendy tools and libraries like MongoDB and CarrierWave and neglected the basics.
Validating user input is probably the first thing you learn about web application programming, which is frequently taught at universities, or in books titled "web application programming" that you should at least skim before starting a project like this. Don't blame college for this. Just because something wasn't focused on in college (it was, though) and they went to college does not mean it was college's fault. Would it be fair to blame college for any other mistake they made, so long as college didn't "focus" on it? No. Some things are common sense.
Most likely, the culprit was time constraints, which is far more excusable.
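(For anyone wondering what "validating user input" even looks like in practice: in a Rails app it starts as small as model validations. A minimal sketch - the model and fields here are made up:)

    class Comment < ActiveRecord::Base
      # Refuse blank or absurdly long bodies before they hit the DB.
      validates_presence_of :body
      validates_length_of   :body, :maximum => 10_000
      # Only accept things that look vaguely like an email address.
      validates_format_of   :author_email, :with => /\A[^@\s]+@[^@\s]+\z/
    end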
Bingo. I went to Georgia Tech, which has a pretty damn good CS program, and I had to hunt for security classes. One was a "special topics" course that wasn't available very often and didn't have anything to do with application security (was a Net. Sec. course). The other was not a CS course but a Comp. Eng. course and was focused on penetration testing. :/ I actually earned a "Network Security" certificate with my degree which I never even knew was available (it wasn't mentioned anywhere in the course literature).
Since I've graduated they have redone the whole CS dept so I don't know if things have changed, though.
But like someone else said, a lot of this stuff is common sense, especially if you're a programmer and have systems knowledge. And I think most programmers have the habit of imagining all the different ways things could break while they are coding, too - a hacker's curiosity that most of us share. I know when something looks obviously wrong on a website or in an application I'm using, I start to poke around and see what I can uncover.
We had a class on it. They basically pushed us through OWASP from front to back :)
I might be able to excuse this since they're fundamentally still in alpha (or pre-alpha) and were rushing to get code out.
I wouldn't. Authorization is the sort of thing that has to be done first.
First, mass assignment.
The answer to mass-assignment bugs is "attr_accessible". Accessible attributes can be set via update/build/new; nothing else can. Every Rails AR model should have an "attr_accessible" line in it.
I've met smart dev teams working under the misconception that attr_accessible means "these are the attributes that can be changed based on user requests", and so virtually everything is made accessible. No! If something's not attr_accessible, you just set it manually (user.foo = params[:user][:foo]). It's not painful and the extra line expresses something important ("this is a sensitive attribute"). Attributes are inaccessible until they prove themselves mass-assignment-worthy.
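For reference, the pattern looks like this (a minimal sketch; the model and attributes are made up):

    class User < ActiveRecord::Base
      # Whitelist: only these attributes can be set via
      # new/build/update_attributes (mass assignment).
      attr_accessible :name, :email
      # Sensitive attributes like `admin` are deliberately left off.
    end

    user = User.new(params[:user])  # an :admin key in params is ignored
    user.admin = false              # sensitive attributes get set by hand
    user.save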
Second, the string interpolation in the regex.
Real quick: don't ever let users interpolate arbitrary strings into regular expressions. Regular expression libraries are terribly complicated and not very well tested. To illustrate (but not fully explain) the danger here, run this line of code:
ruby -e "'=XX===============================' =~ /X(.+)+X/"
Oh, one more thing: I appreciate Patrick's take on systems failures breaking Rails apps before underlying crypto flaws will, but even if they had protected their keys, their crypto wouldn't have worked. Don't build things that require crypto. You aren't going to get it right.
I'd do you one better: use an initializer to monkeypatch ActiveRecord::Base and fire "attr_accessible nil", which will cause mass assignment to fail on any object you create from a class which doesn't make the assignment explicit.
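Concretely, that's a one-liner in an initializer (a sketch assuming Rails-era ActiveRecord, where the attr_accessible whitelist is inherited by subclasses; the file name is made up):

    # config/initializers/lock_down_mass_assignment.rb
    # Default every model to an empty whitelist. Mass assignment now
    # fails closed unless a model declares its own attr_accessible.
    ActiveRecord::Base.send(:attr_accessible, nil)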
 Obviously large companies with massive Ruby code bases can't really do this. Not sure what to say there.
For your company's own code, reopening a class should be a huge flag in code review (something like Gerrit should be in place at every large company), but it's not sustainable to police the dependencies of the libraries you use, especially when the default in the Rails community is spray and pray.
@question.safe_update(%w[title body language tags], params[:question])
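(safe_update isn't stock Rails, for anyone looking for it; presumably it's a small helper along these lines - a hypothetical sketch:)

    module SafeUpdate
      # Apply only the whitelisted keys from an untrusted params hash,
      # dropping anything else the client smuggled in.
      def safe_update(allowed_keys, attrs)
        update_attributes(attrs.slice(*allowed_keys))
      end
    end
    ActiveRecord::Base.send(:include, SafeUpdate)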
"Don't let users interpolate, ever" is close to truth. It isn't quite truth, but it's a lot shorter than the truth.
Why does that hang in Ruby? In Perl it's fine...
anchored "X" at 0 floating "X" at 2..2147483647 (checking floating) minlen 3
Guessing start of match in sv for REx "X(.+)+X" against "=XX==============================="
Found floating substr "X" at offset 2...
Contradicts anchored substr "X", trying floating at offset 3...
Did not find floating substr "X"...
Match rejected by optimizer
I wrote an article a while back about the differences between PCRE and Perl's engine: http://use.perl.org/~avar/journal/33585
Regexp engines are subtle beasts, and there are a couple of different ways to implement them (DFAs vs. NFAs, simple engines vs. lots of clever special cases, etc.). See O'Reilly's "Mastering Regular Expressions" for an exhaustive discussion.
You should use a regex engine that's explicitly designed to take potentially hostile input. Like the Plan9 engine, or Google's re2 engine which powers Google Code Search.
You can also just use Ruby's dangerous backtracking regex engine if you do something like forking off another process, with strict ulimits, which executes the regex for you. Then you can just kill it if it starts running away with your resources. Look into how e.g. the evaluation bots on the popular IRC channels on FreeNode are implemented; POE::Component::IRC::Plugin::Eval on the CPAN is a good example.
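A rough sketch of that fork-and-limit idea in plain Ruby (Unix only; the one-second cap and the method name are made up):

    # Run an untrusted regex match in a child process capped at one
    # second of CPU time, so catastrophic backtracking can't take the
    # whole app down with it.
    def match_with_limit(pattern, input)
      reader, writer = IO.pipe
      pid = fork do
        reader.close
        Process.setrlimit(:CPU, 1)  # the kernel kills us past 1 CPU second
        writer.write((input =~ pattern) ? "1" : "0")
        writer.close
        exit!(0)
      end
      writer.close
      result = reader.read  # empty if the child was killed mid-match
      Process.wait(pid)
      result == "1"
    end

    match_with_limit(/X(.+)+X/, "=XX===============================")
    # => false; the child process dies instead of your app server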
I'd just scrub the hell out of strings before passing them to a regex engine.
The key though is to make sure that nobody ever sees that code. Hopefully it will be locked away in some intranet vacation-time-planning app that nobody will ever dig into. That way you can look back at it in shame, but few others will ever know about it.
So here we have a team of people who have clearly never built anything at all, trying to learn on the job while being scrutinized by the entire world and actually submitting their code for public review.
God help them.
I was wrong... these aren't really security "holes," as that's not a strong enough word. I think the best way to put it is that they accidentally created the first social network wiki.
I agree though, these aren't like subtle security holes that would need a security expert to review. Checking that a user owns the resource they are requesting to modify is basically common sense.
If it were common sense to do it, they would have done it. It's not. It's a very distinct thought pattern shift from "the browser is a part of the execution of our code and it will only try a delete link which the code has generated" to "the user can request anything at any time no matter what links we have or haven't generated or what they can see on screen".
It's a learned shift specific to some subsets of some kinds of computer programmers, not at all "common sense".
(and besides, even if it were common sense, what's the point in your comment then?)
Your assumption that everyone shares in common sense equally is a bit optimistic.
So, then you must agree that they clearly don't understand, as you say "the user can request anything at any time no matter what links we have or haven't generated or what they can see on screen". To me, this shows a lack of understanding of basic guidelines of web programming, namely that you can never, never trust user input, whether it's form submissions or cookies.
Perhaps not common sense, but nor is it an advanced principle. If you've ever used Firebug for more than a couple of hours, you'd have figured out on your own that you can change forms and then submit them. If you've even used a browser for a while, you will have realized you can type in different numbers in query strings. If they haven't noticed that by now - what are they doing taking on a project like this?
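And the fix for the class of bug being discussed is a one-line change of habit in the controller. A sketch, assuming the usual current_user helper and a has_many association - all names made up:

    # Vulnerable: trusts that users only request ids they were shown.
    def destroy
      @photo = Photo.find(params[:id])
      @photo.destroy
    end

    # Scoped: look the record up through its owner, so a forged id
    # belonging to someone else raises RecordNotFound instead.
    def destroy
      @photo = current_user.photos.find(params[:id])
      @photo.destroy
    end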
What I really would like to see is a documented protocol - based on XMPP or some other established, well-tested protocol would be good, but if not then at least something.
Once you have that protocol - which tells you how Diaspora "seeds" communicate securely - you can let others build their own implementation, using Rails, PHP, Python, doesn't matter. Sure, release a reference implementation in Rails, but the protocol is the most important thing.
Unfortunately what we have is just another Facebook clone done in Rails, which is disappointing.
While meeting both of those objectives in those timescales might be possible it would be a truly remarkable achievement. Not surprisingly it didn't happen and they released something that pleased nobody - all we can hope for is that they learn some lessons and move onto better things.
Nobody stepped up to write a decent client, and the product was judged (unfavorably) on the merits of the reference implementation.
I realize these guys are in college, but they really should have (a) brought people's expectations in line with their abilities and (b) reached out to experienced developers to help them out. Intridea probably would have given them a few developer hours per week to help code, advise them, and so on. Just a little input and guidance would have saved them a lot of grief.
I'm inclined to think it was neither and they just didn't think anyone would notice. It happens.
I lol'd. Mind if I use that?
SQL databases' failure modes are also well understood (e.g., in MS-SQL I can stop the remainder of a statement from executing with '--'). MongoDB, with its JS engine, is still a big unknown.
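To make the '--' point concrete (a toy example of the naive string-built query; never do this):

    name = params[:name]  # attacker sends: admin' --
    sql  = "SELECT * FROM users WHERE name = '#{name}' AND active = 1"
    # The database actually sees:
    #   SELECT * FROM users WHERE name = 'admin' --' AND active = 1
    # In MS-SQL everything after -- is a comment, so the active
    # check is silently dropped.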
"...secret squirrel double-plus alpha unrelease..."
Mind if I use that? It would be a terrific title for an animal fighting game I've been itching to make.
1. When media hype provides you with $200k, you're still best served by bringing people's expectations down to earth. There was no way they were going to be able to build anything approaching a Facebook killer in 3 months, and it would have been best if they would have made that clear in the beginning.
2. There are a number of projects like Diaspora that have been working for years towards the same exact goal. The only way Diaspora could have succeeded where those have failed (or not-yet-succeeded), is if they had properly articulated where they would go right where others had gone wrong. If you can't do that, you probably don't have the perspective necessary to take on such a huge project.
3. We all need to be less susceptible to the story of the "boy wonders" taking on the establishment. Between Diaspora and Haystack+, these should be sobering lessons about the dangers of hype and a good human-interest story when the result we need is stable, well-written code.
4. Open source isn't magic. Rails isn't magic. Having a good idea and a whole lot of heart isn't magic. It's true that the software world isn't exactly a meritocracy, but at the same time, we need to recognize that you have to be able to build something generally usable, and if you can't, there's nothing that will save you except for harder work and learned lessons. Some bugs will be fixed by the interest in Diaspora, but there are big architecture questions here that need to be resolved, and coordinating that democratically through the internet in a sea of strangers is a logistical nightmare. And while rails does provide a lot of functionality out-of-the-box, a project of this size isn't held up by how long it takes to write the photo-uploading code, it's held up by the big-picture stuff that rails can't really help you with. In the end, programming is programming. You need specs, mockups, user stories, documentation, all kinds of unsexy stuff.
I think we'll probably see Diaspora stabilize into something usable at some point. But I'm very doubtful that will be anytime soon, and I'm especially doubtful that it will be before the other projects (Elgg, Appleseed, OneSocialWeb, StatusNet, etc) mature into the facebook killer people want to see.
Building the kind of open source social networking software necessary to take on Facebook at its own game is such a massive, complex undertaking, and such undiscovered territory, that there really is a big disadvantage to being the new kid on the block.
There were red flags in their kickstarter post:
* "We are four talented young programmers from NYU’s Courant Institute trying to raise money so we can spend the summer building Diaspora"
* "Diaspora knows how to securely share (using GPG) your pictures, videos, and more."
* "We have a plan, a bunch of ideas and the programming chops to build Diaspora. What we need is the time it takes to iron out a powerful, secure, and elegant piece of software. Daniel, Ilya, Raphael, and Maxwell are all ready to trade our internships and summer jobs for three months totally focused on building Diaspora"
* "We promise to you that Diaspora will be aGPL software which will released at the end of the summer."
They also said that no similar system exists (several do).
Isn't it though? I think popular open source projects have magic. These guys put out a demo with lousy unsecured code and rookie mistakes and within a week the worst offenders were identified and repaired.
I'm not saying they are great developers, but certainly OSS is some sort of powerful magic.
Open source is not good for making use of the "Many eyeballs" for architectural decisions. It's where the wisdom of crowds provides little benefit. This is why most open source projects are not a haphazard bazaar of stone soup contributors, like most people think, but are actually small dedicated teams of talented software developers, who are usually paid.
I think our conception of what OSS can achieve, simply by being OSS, is somewhat inflated.
(Don't get me wrong, though, OSS is fantastic, and I think it's, by design, much better than closed source)
Even if the writers don't write chapters dedicated to security (and obviously even that wouldn't be enough on its own), they should at least note which code is entirely unsafe for production, and where you can learn more about it.
Every rails book is filled with this sort of code that people learn, and then use.
(exampleCode != productionCode)
 http://guides.rubyonrails.org/security.html (Well-written, like the other guides. Totally worth reading fully).
 http://www.owasp.org/index.php/Top_10_2010-Main (Open Web Application Security Project's top application security risks for 2010)
(I favor this and agree with you, incidentally.)
A nice one-page security guide would be a "lil Bobby Tables" guide to databases: injection for any database, SQL or NoSQL, with the goal of helping developers prevent these attacks.
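The one-page version mostly boils down to "never splice user input into the query; use your driver's placeholders." In ActiveRecord terms (a sketch; the model and params are illustrative):

    # Bad: splices user input straight into the query string.
    User.where("name = '#{params[:name]}'")

    # Good: placeholders let the adapter do the quoting.
    User.where("name = ?", params[:name])

    # Also good: hash conditions are quoted for you.
    User.where(:name => params[:name])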
That's a challenging proposition even for experienced teams.
Though we all know that they warned us the software was still full of bugs and should be treated as experimental.
Patrick says that doesn't matter, that they haven't got the foundation right and anything built on top of it will likely fall.
Maybe they can still iterate and fix those things. Maybe they will scrap large sections of the code and re-write it properly. I suggest they keep trying and learn something from all of this. Us older geeks can be pretty harsh sometimes. We often expect new-comers to not make the same mistakes we did once. That's how we learn though, so don't take it the wrong way.
Next time just don't promise more than you can deliver.
What they've released looks like your average weekend GitHub side-project. I suspect a large % of Hacker News members have projects like this (though hopefully with better security ;-)). So what did they spend the money on?
As many have pointed out, they do not teach this stuff in university; to acquire such knowledge you have to be very proactive about your own development if you do not have industry experience.
So thanks a lot for sending these nuggets our way.
You stated that the team is manifestly out of their depth in terms of web security; how do you suggest they proceed, given their one-month deadline?
Can they pay a security expert to resolve these issues, or does no one want to come within a mile of this?
I don't know if you can get a security expert to fix this for you. You can certainly wave a big enough check around to get somebody to look over your code, but that won't magically improve code you haven't written yet. Also, it is highly likely that Diaspora is architecturally insecure -- that, beyond the "Oops, didn't check the input" code-level whoopsies, the federation strategy as written (and apparently as not documented outside of the source code) just cannot be made to work right.
They may have spent most of the summer architecting this thing and figuring out things like "seeds". When it came down to putting code on screen, as it were, they had minimal time to do so. All of these security issues feel like the release was rushed and that they might have been better off releasing it silently, without a way to deploy public nodes, and having a blog post explaining the situation.
In terms of micro-architecture, take a look at Wordpress. I love Wordpress, don't get me wrong, but it almost can't be made secure due to some design decisions that can't be reversed, such as "Wordpress templates contain executable code with direct unfiltered access to the database."
On reflection, I shouldn't have expressed this through mockery at all, but through helping. :(
Interesting. I was thinking in comparison to PHP, where a beginner would much more likely be individually assigning each input into a SQL statement. Sure, there are all kinds of things that can go wrong there too, but mass assignment looks like it makes it awfully easy for someone to shoot themselves in the foot.
If you open source something, unless it's perfectly written, wouldn't the hacking potential be... near 100%? If everyone can see how you do everything it seems like even a minor slip up will potentially surrender your site.
Could someone explain this (I'm probably missing a piece of the puzzle I can't place)?
Now, if you're a highly anticipated project and you're making errors covered in every Security 101 article which happen to be very visible, then OSSing your code makes it highly likely that people will see those, for good and ill. What scares me for Diaspora's future isn't those errors -- it is the part of the iceberg below the waterline. I mean, if you're steaming at full speed towards a gigantic "I'M GONNA RIP UP YOUR BOAT!" sign, there is probably something underwater and I doubt any qualified security guy (I am so not one) will donate you a few tens of thousands of dollars to tell you how screwed you are right now.
Black-boxing is where you throw known inputs at a system and measure the responses, deducing from those what the system is doing.
White-box testing is knowing what is happening internally, so you can immediately skip to step two of a security test - which is exploiting.
With black-boxing it takes a very, very long time to learn the entire system and its workings, but it can be done. Having the code means you can skip that step. E.g., with Diaspora, from launch to exploitation was a matter of minutes.
The Twitter URL escaping bug from this week was from within one of their public source code repositories. While they don't release everything as open source, they have released enough to give an attacker a good view of their stack and how it works. Bugs not detected in the open source code are likely to also appear in other parts of the platform that are closed source (since they weren't detected in the first place)
That said, MPWILGSIANSE (my password is LadyGaga so I am no security expert)
Now, while it most likely wasn't their primary intention to have the OSS community (and others curious about the code) provide fixes and feedback, they sure have gotten a lot of input and advice. I also suspect they have learned quite a lot through the feedback - positive, constructive, and inflammatory. You really can't ask for much more, aside from some direct help.
While I have no plan to run my own Diaspora node, I do look forward to seeing how the code and project evolve.
The comment negates some of the statements made in the post.
There are no deep, tricky issues explained because absolutely zero effort was needed to find a half dozen breathtakingly bad practices floating at the surface.
So yes, of course it's trivially fixable. The problem is that it wasn't trivially fixed and they thought they were ready to release it.
The original post lists a lot of examples; the reply adds some more information to those; the "reply" moves this almost into argumentum ad hominem territory.
I'm just saying that it would be nice if the OP would at least delete the stuff that is just plain wrong.
The idea that "these guys are going to get all this money, work for three months, and then surely deliver this top-notch Facebook killer that we all want to use!" is a bit absurd, that's all, given that there are plenty of other projects working in this area and no reason at all to think this one would be superior, the best, or even suitable. If it weren't for the NYTimes etc., this would be just another project on GitHub. The media has taken a normal situation and screwed it up majorly, and it doesn't appear to me that it's going to turn out that great for anyone involved.