
Security for Open Source Code: Dynamic Analysis Is the Only Way - briandoll
https://blog.sourceclear.com/dynamic-analysis-is-the-only-way/
======
morbosoft
That is not what dynamic analysis means, but I appreciate everyone redefining
terms to suit their marketing.
[https://en.wikipedia.org/wiki/Dynamic_program_analysis](https://en.wikipedia.org/wiki/Dynamic_program_analysis)

~~~
joshuata
Agreed. Their definition of static analysis is laughably bizarre as well;
there are many security and safety properties that can be proved by static
analysis but not by dynamic analysis. Additionally, many of the problems they
attribute to static analysis (non-specific versioning and recursive
dependencies) are caused by the package managers themselves, and can be solved
using static analysis if the build tool is well defined and well behaved.

Their tool has a few major weaknesses:

  1. Builds are not reproducible - Reproducible builds require pinned package versions, which they specifically avoid. This could result in security holes if a dependent package version is bumped after the parent project was tested. (See the sketch after this list.)

  2. Subject of analysis - Their main target of analysis is the build file. This severely limits the extent of their analysis and requires them to build new tools for each build tool used.

  3. Underestimation - Since a code path must be exercised in order to analyze it, the set of vulnerabilities detected is guaranteed to be a subset of (or equal to) the true set. This is the opposite of what I believe should be the default: always prefer false positives over false negatives. Both static analysis and binary analysis let the analyst over-approximate, guaranteeing that every true vulnerability is flagged.
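For point 1, a minimal sketch of the kind of pin check a reproducible setup would want (plain parsing of a hypothetical pip-style requirements.txt, not their tool's logic):

    # flag_unpinned.py - warn about dependency specs that are not pinned
    # to an exact version (the requirements.txt path is hypothetical)
    import re

    with open("requirements.txt") as f:
        for line in f:
            spec = line.split("#")[0].strip()  # drop comments and blanks
            if spec and not re.match(r"^[A-Za-z0-9._-]+==", spec):
                print(f"not pinned: {spec}")

A spec like "requests>=2.0" gets flagged; "requests==2.31.0" passes.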

~~~
jwn
I'm not sure what you mean by #1; the builds are as reproducible as the build
system being used.

For #2, SourceClear doesn't build the software under test; it hooks into
the build via various methods. For Maven and Gradle, those are plugins. For
NPM and Bundler, the existing build files contain the complete dependency
graphs as determined by the build system. The analysis is quite accurate, I
daresay more so than any other tool. Yes, it requires implementation work for
each build stack, but that's the price you pay for accuracy.

In response to your #3, SourceClear doesn't report only vulnerabilities
verified by call paths; it reports on _all_ components with known
vulnerabilities and indicates whether a call path was found.
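
To make the lockfile point concrete, here is a rough sketch of pulling the complete resolved graph out of an npm lockfile (it assumes the v1-style package-lock.json layout where each entry nests its own "dependencies"; this is not our scanner's code):

    # walk_lockfile.py - enumerate every resolved package, transitive
    # dependencies included, from a v1-style package-lock.json
    import json

    def walk(deps, seen):
        for name, info in (deps or {}).items():
            seen.add((name, info.get("version")))
            walk(info.get("dependencies"), seen)

    with open("package-lock.json") as f:
        lock = json.load(f)

    resolved = set()
    walk(lock.get("dependencies"), resolved)
    for name, version in sorted(resolved):
        print(f"{name}@{version}")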

Disclaimer: I'm a co-founder and have spent a great deal of time writing
scanning code.

------
gaspar
Interesting. So if I understood correctly, you dynamically analyze the build
process (and that's why you use that term) instead of just parsing the build
file, because you don't know exactly how the dependencies are going to be
resolved by the package manager during the build process. How do you verify
that a specific version of a library is used during the build and not some
other version (do you just do a hash lookup, or do you have a way to generate
signatures with few false positives)? Also, what happens if the package
manager is compromised (for example, it reports that it used version 2.0 but
actually used a vulnerable version)? For the call graphs, do you find the
relationships between each procedure for the whole project, and if so, isn't
that literally static analysis? Sorry if my questions don't make sense or are
trivial; I am just looking at it from a research perspective, because I am
working on somewhat similar things.
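
To make the hash-lookup idea concrete, here is the kind of thing I have in mind (the digest table is made up, and this is only a sketch):

    # fingerprint.py - identify an artifact by content hash instead of
    # trusting the package manager's report (digest table is hypothetical)
    import hashlib

    def digest(path):
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 16), b""):
                h.update(chunk)
        return h.hexdigest()

    KNOWN_RELEASES = {  # hypothetical: digest -> (package, version)
        "aabb...": ("library-A", "2.0"),
    }
    print(KNOWN_RELEASES.get(digest("libs/library-A.jar"), "unknown artifact"))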

------
wyldfire
I agree w/ the other comments about the confusion regarding terminology. It's
still static analysis if you're not executing the code.

But, back to the premise: it would be really helpful if you [the author] could
illustrate security defects which can be detected using "dynamic analysis"
which cannot be detected with "static analysis." Legitimate, actually
exploited/able vulnerabilities would be ideal.

------
dkarapetyan
I don't understand what exactly they're saying. It seems to me they're
comparing apples, oranges, and cherries. I don't know if they're doing this
deliberately or if they're trying to make the problem sound harder than it is.

You only need to perform whatever security analysis is necessary after all
your dependencies are resolved. This does not require anything "dynamic". You
just call the package manager, wait for all the dependencies to be resolved,
then verify there are no vulnerable versions, which is just a matter of
looking up the relevant pieces in some database somewhere.
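
Something like this sketch, say (npm shown, and the advisory data is made up; any package manager with a machine-readable listing would do):

    # check_resolved.py - ask the package manager for the resolved tree,
    # then match each exact version against a (hypothetical) advisory list
    import json
    import subprocess

    ADVISORIES = {"lodash": {"4.17.15"}}  # hypothetical vulnerable versions

    # npm 7+ needs --all to print the full nested tree
    out = subprocess.run(["npm", "ls", "--all", "--json"],
                         capture_output=True, text=True).stdout
    tree = json.loads(out)

    def walk(deps):
        for name, info in (deps or {}).items():
            if info.get("version") in ADVISORIES.get(name, set()):
                print(f"vulnerable: {name}@{info['version']}")
            walk(info.get("dependencies"))

    walk(tree.get("dependencies"))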

Which is fine if that's what they're doing, but this article seems to be
mostly smoke and mirrors.

~~~
briandoll
We published this because there are lots of legacy tools that approach this
problem with two specific weaknesses:

* Many tools just scan a text file where you declare your dependencies, which misses transitive dependencies and won't tell you exactly which versions are in use (see the sketch below)

* They use incomplete datasets for vulnerabilities, like CVE. Most OSS projects don't create CVEs for vulnerabilities, so CVE alone is a mostly useless data source.
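
To make the first point concrete, here is roughly all a manifest-only scan has to work with (hypothetical package.json; compare it with the resolved tree your build actually uses):

    # manifest_only.py - a manifest scan sees direct dependencies and
    # version ranges only: no transitive deps, no exact resolved versions
    import json

    with open("package.json") as f:
        manifest = json.load(f)

    for name, range_spec in manifest.get("dependencies", {}).items():
        print(f"{name}: declared {range_spec!r}, resolved version unknown")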

Besides having a complete view of the dependencies, we have a research team
that is finding and disclosing new vulnerabilities all the time, which you can
see here:
[https://www.sourceclear.com/registry/explore](https://www.sourceclear.com/registry/explore)

~~~
dkarapetyan
That's a lot clearer and a much better description. I didn't get any of
that from the post. Instead, you explained several disparate processes, each
of which has its place in security analysis, and declared a specific one the
winner.

------
r-w
Except when the language itself is statically and strongly typed (a la
Haskell).

~~~
briandoll
I haven't worked with Haskell myself (and we don't currently support Haskell),
but it looks like Cabal is the package manager, which resolves the full graph
of dependencies for your project.

Most dependencies are specified like "network-info >= 0.2" (taken from a
random example:
[https://github.com/bitemyapp/blacktip/blob/master/blacktip.c...](https://github.com/bitemyapp/blacktip/blob/master/blacktip.cabal)),
which means you don't know which version will actually be used until you build
the project. Which is our point in this blog post. If you just scanned that
text file, you'd just be guessing at which versions of open source libraries
are actually being used.

~~~
r-w
I don’t have a very in-depth understanding of dynamic analysis myself, so this
may be somewhat of a simplistic question: Why not just resolve to the latest
version of each library/module present on the system during the first pass
through the list of dependencies?

~~~
briandoll
The package manager decides which packages are used based on how dependencies
are specified. For example, you might ask for library-A >= 1.1, but a
transitive dependency specification (a dependency of a dependency) may specify
library-A >= 1.0 && <= 1.6. If versions up to 1.7 are available, you'd
probably get 1.6. Probably. Because with some package managers, if you have
version 1.5 sitting around in a local cache, it may use that instead.

Basically, never try to guess what the package manager is going to resolve.
Just let it do its job, just as it does for your production builds, and use
that information to look up any associated vulnerabilities.
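
As a rough sketch of that intersection (using Python's packaging library for the version arithmetic; this illustrates the resolution rule, it is not our resolver):

    # resolve_sketch.py - intersect the direct and transitive constraints,
    # then pick the highest available version satisfying both
    from packaging.specifiers import SpecifierSet
    from packaging.version import Version

    combined = SpecifierSet(">=1.1") & SpecifierSet(">=1.0,<=1.6")
    available = [Version(v) for v in ["1.0", "1.1", "1.5", "1.6", "1.7"]]

    candidates = list(combined.filter(available))
    print(max(candidates))  # 1.6 -- 1.7 exists but violates the <=1.6 bound

And even then, a warm local cache can hand you 1.5 instead, which is exactly why we read the resolver's output rather than try to predict it.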

~~~
khedoros
You may be talking about two different situations. "r-w" might be thinking
about a build style where compilation just uses the versions of libraries
available on the system, rather than going to the package manager on a per-
build basis and re-fetching dependencies (so, something more like a C build
than a Rust, Haskell, or Node build).

As an aside, and echoed by a few other comments in the thread, what you're
calling "dynamic analysis" is what I know as "static analysis", like what I
want Coverity to do (watch the build process, and monitor the source that
actually goes into each of my build artifacts). "Dynamic analysis" brings to
mind something more like Valgrind, or some other tool that monitors and
profiles a program during execution.

