

Security of software repositories (yum, maven, npm, cpan, etc.) - dougdonohoe

As background, over the last couple of weeks I re-watched an old favorite, Austin Powers (I was once Dr. Evil for Halloween) and I read A.I. Apocalypse (http://aiapocalypse.com/) by William Hertling, which involves malicious viruses taking over millions of computers and becoming self-aware.

This got me thinking about how an evil genius (or government agency) might try to infect or control millions of machines for nefarious purposes, which led me to wonder about all the software that auto-updates or auto-fetches, whether through yum, maven, npm, cpan, or even Chrome and Firefox.

So my question is, how secure are those mechanisms? Would it be possible to install malicious versions of popular open source libraries? I understand there is signing and such, but what about getting something into the build before signing happens (e.g., altering the checked-out code)? Like, say, blackmailing a build engineer or hijacking a build system?

I'm asking because I've been using these systems for years and never really considered how to know if library X is actually 'clean'. I suspect many software developers are like this - just running 'apt-get' or 'npm install' or 'yum update' or 'mvn' or 'brew install' or... what have you.

Have other developers wondered about this? Do the security heads at these repositories lose sleep over this?

P.S. I'm not an evil genius or government agency.

P.P.S. Of course, who would admit to being an evil genius or from a government agency?
======
switch33
Software is usually verified by some form of cryptographic hashing scheme like
MD5, SHA-1, or SHA-2.
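For concreteness, here is a minimal Python sketch of what that kind of digest check looks like. The filename and the expected digest are placeholders; real package managers do this (plus signature checks) for you:

```python
import hashlib

# Minimal sketch: verify a downloaded package against a published digest.
# "package.tar.gz" and the expected digest are placeholders; repositories
# publish these alongside the artifact (and ideally sign them too).
EXPECTED_SHA256 = "<digest published by the repository>"

def sha256_of(path: str) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large packages don't need to fit in memory.
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

if sha256_of("package.tar.gz") != EXPECTED_SHA256:
    raise SystemExit("digest mismatch: refusing to install")
```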

Windows used to rely on MD5 until it was proven to be horribly broken: a
collision can be generated so that two different programs have the same MD5
digest, disabling the security guarantees it was supposed to provide. The
Flame malware (widely believed to be related to Stuxnet) used a forged
certificate built on exactly such a collision to overcome Windows Update's
installed-package checking mechanisms.

SHA-1 has taken over for verification in a lot of places, including many
Linux package repositories. Its security is debatable to a certain extent.
There are efforts to move things to SHA-2, but as far as I know that
migration isn't complete yet.

Malware persistence is the study of how malware keeps itself on a computer or
network after infection. It is a very big field, considering most
organizations run apps from many different software vendors.

When you install an app, it can come in source form yet still contain many
pre-compiled binaries. This is problematic. A good example is Hadoop (famous
for big-data computation), which ships with lots of pre-compiled .jar files
(compiled Java sources) that you end up trusting blindly.
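As an illustration, here is a small Python sketch that inventories the pre-compiled artifacts inside an unpacked distribution so they can at least be hashed and compared against a known-good list. The root directory and the extension list are assumptions for the example, not a complete list of binary formats:

```python
import hashlib
import os

# Hedged sketch: inventory pre-compiled artifacts shipped inside a source
# distribution (e.g. an unpacked Hadoop tree). Extensions and root path
# are illustrative assumptions.
BINARY_EXTENSIONS = {".jar", ".so", ".dll", ".class"}

def inventory(root: str) -> dict:
    digests = {}
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            if os.path.splitext(name)[1] in BINARY_EXTENSIONS:
                path = os.path.join(dirpath, name)
                with open(path, "rb") as f:
                    digests[path] = hashlib.sha256(f.read()).hexdigest()
    return digests

# Print every binary artifact and its digest for later comparison.
for path, digest in inventory("hadoop-2.6.0").items():
    print(digest, path)
```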

Static analysis and verification have become the go-to approaches for a lot
of work on verifying binary integrity. But every format and package store is
different and offers different levels of security. There are also more
advanced methods of protection like capabilities (monitoring based on what
the apps have access to). A good minimal starting point is some of the
libraries released by Google, including Shipshape:
[https://github.com/google/shipshape](https://github.com/google/shipshape)

Many companies can no longer get away with just installing a firewall or
relying on security products, because security is now an "inside problem":
the software they download through their cloud service (OpenStack, for
example) comes from many different vendors. Docker is another example - it
packages apps from many different vendors together while providing isolation
from the host system.

SHA-1 has not been shown to be broken in practice, but there are some strange
attacks suggesting it might be overcome through other means. A prime example
of verification being bypassed without breaking the hash at all is the SSH
backdoor Linux/Ebury
([http://www.welivesecurity.com/2014/02/21/an-in-depth-analysis-of-linuxebury/](http://www.welivesecurity.com/2014/02/21/an-in-depth-analysis-of-linuxebury/)).
Ebury is notable in that during regular execution there are no noticeable
changes for the user; the machine looks clean unless you run specific shell
commands that regular users do not normally run.
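One widely circulated indicator of compromise for Ebury at the time was checking how ssh handles the -G flag. A hedged Python wrapper around that check follows; note the assumption of an OpenSSH build of that era, since modern OpenSSH added a legitimate -G option and this particular test is no longer meaningful:

```python
import subprocess

# Hedged sketch of a widely circulated Ebury indicator of compromise.
# Assumption: an OpenSSH build of that era, where -G was an unrecognized
# flag. Ebury's patched ssh accepted -G silently, so the usual error
# message was absent on infected hosts.
result = subprocess.run(["ssh", "-G"], capture_output=True, text=True)
combined = result.stdout + result.stderr
if "illegal" in combined or "unknown" in combined:
    print("looks clean: ssh rejects -G as expected")
else:
    print("suspicious: ssh accepted -G (possible Linux/Ebury)")
```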

As for companies installing older versions of software with known remote code
execution vulnerabilities or other very severe vulnerabilities, I do not know
of many services or open source tools that check for this! There are many
continuous integration services, but CI that isn't run inside your own
company has verification problems as well.
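Today one could script such a check against a public vulnerability database. Here is a hedged sketch using the OSV.dev API (a service that postdates this discussion); the package name and version are illustrative:

```python
import json
import urllib.request

# Hedged sketch: ask OSV.dev whether a specific installed version has
# known vulnerabilities. Package and version are illustrative examples.
query = {
    "version": "2.7.1",
    "package": {"name": "jinja2", "ecosystem": "PyPI"},
}
req = urllib.request.Request(
    "https://api.osv.dev/v1/query",
    data=json.dumps(query).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    vulns = json.load(resp).get("vulns", [])
print(f"{len(vulns)} known vulnerabilities for jinja2 2.7.1")
```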

A good monitoring practice is to use your package manager (Homebrew on a Mac,
for instance) to track what is installed on a regular basis and check it
against some form of whitelist.
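A minimal sketch of that idea follows, assuming Homebrew and a one-entry-per-line whitelist file (the file name and format are assumptions for the example):

```python
import subprocess

# Hedged sketch: diff the installed package list against a whitelist.
# `brew list --versions` is a real Homebrew command; the whitelist file
# format (one "name version" per line) is an assumption.
with open("whitelist.txt") as f:
    allowed = {line.strip() for line in f if line.strip()}

installed = subprocess.run(
    ["brew", "list", "--versions"],
    capture_output=True, text=True, check=True,
).stdout.splitlines()

# Report anything installed that the whitelist doesn't cover.
for entry in installed:
    if entry.strip() not in allowed:
        print("not on whitelist:", entry)
```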

Network traffic analysis is a major headache for companies as well. Traffic
can be compromised in many different ways; the Wikipedia article on covert
channels is a good explanation of some of the complexity involved. There are
many data exfiltration tricks that are basically undetectable by normal means
(they require some level of active monitoring). Almost all anti-virus
products now include man-in-the-middle network traffic monitoring to do their
work.

Anti-virus companies generally have two methods of classifying software as
good-ware vs. bad-ware: whitelisting and blacklisting. A whitelist is a
curated list of known-good software; a blacklist is a curated list of
known-bad software. Which approach is better is debatable. Even though your
anti-virus itself could be compromised, having one usually provides some
level of security.
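A toy sketch of the two strategies, classifying a file by its digest; the digest sets and file path are placeholders for what would really be large curated feeds:

```python
import hashlib

# Toy sketch: whitelist of known-good digests vs. blacklist of known-bad
# ones. Digests and file path are placeholders.
KNOWN_GOOD = {"<digest-of-approved-build>"}
KNOWN_BAD = {"<digest-of-known-malware>"}

def classify(path: str) -> str:
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    if digest in KNOWN_BAD:
        return "bad-ware (blacklisted)"
    if digest in KNOWN_GOOD:
        return "good-ware (whitelisted)"
    # Whitelisting treats unknowns as suspect; blacklisting allows them.
    return "unknown"

print(classify("/usr/local/bin/sometool"))
```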

Despite all this, people realize that running a company without using
pre-made open source code is rather ridiculous, since so much runs on app
stores or third-party software sources outside your company's control.

Security in many cases can be thought of as a castle with multiple walls or
layers of defense. Treating your firewall as your main wall of defense is a
bad decision: a wall is just what lets your castle stand up to some attacks;
it doesn't actively patrol inside the walls for insider threats. Companies
need active monitoring of access control logs, especially since many common
interactions can look malicious but are just part of people's daily jobs now.

A lot of security is moving to "anomaly detection," where anomalies are jolts
of irregularity in the normal business day. Some activities look normal in
isolation but become hazardous when sustained or extended over a longer
period (denial-of-service attacks are a good example). While anomaly
detection is a good trend with smart systems, it suffers from the same
"packaging and extending the attack surface" problem as everything else.
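A toy sketch of the idea, flagging samples that jump well above a rolling baseline; the data and the 3-sigma threshold are illustrative assumptions:

```python
import statistics

# Toy sketch: anomaly detection on a request-rate time series. Flag any
# sample more than three standard deviations above the rolling mean.
# Data and threshold are illustrative assumptions.
requests_per_minute = [110, 95, 102, 98, 105, 99, 101, 97, 960, 103]

WINDOW = 5
for i in range(WINDOW, len(requests_per_minute)):
    window = requests_per_minute[i - WINDOW:i]
    mean = statistics.mean(window)
    stdev = statistics.stdev(window) or 1.0  # avoid a zero threshold
    value = requests_per_minute[i]
    if value > mean + 3 * stdev:
        print(f"minute {i}: {value} req/min looks anomalous (mean {mean:.0f})")
```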

This is a short overview of how security works in general; if you have any
questions, feel free to ask.

