> 18,924 of all 51,627 call sites are found to be statically safe (36.66%)
> The templates for the vast majority of call sites have at most one hole, and very few templates contain more than five.
If you've got a dependency that calls eval or exec, those aren't great odds that they're doing it safely.
If, for example, someone had a utility which resized user-uploaded images, you couldn't say that simply calling exec to run something like ImageMagick was unsafe without first checking whether it passed the user's filename into the command.
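To make the distinction concrete, here is a rough sketch of the two ways such a resize utility might build its command. The helper names (`buildShellCommand`, `buildResizeArgs`, `resize`) are made up for illustration, and it assumes ImageMagick's `convert` binary with its `-resize` option; this is not code from the paper or from any real module.

```javascript
const { execFile } = require('node:child_process');

// Risky variant (hypothetical): the filename is pasted into a string that
// a shell would parse if it were handed to child_process.exec().
function buildShellCommand(inputPath, outputPath, size) {
  return `convert ${inputPath} -resize ${size} ${outputPath}`;
}

// Safer variant (hypothetical): every value is its own argv entry,
// so no shell ever interprets the filename.
function buildResizeArgs(inputPath, outputPath, size) {
  if (!/^\d+x\d+$/.test(size)) {
    throw new Error(`unexpected size: ${size}`);
  }
  return [inputPath, '-resize', size, outputPath];
}

// execFile() runs the binary directly and does not spawn a shell by default.
function resize(inputPath, outputPath, size, callback) {
  execFile('convert', buildResizeArgs(inputPath, outputPath, size), callback);
}

// A hostile upload name, only printed here, never executed.
const hostile = 'cat.png; rm -rf ~';
console.log(buildShellCommand(hostile, 'out.png', '100x100'));
console.log(buildResizeArgs(hostile, 'out.png', '100x100'));
```

In the first form the `; rm -rf ~` would be read by the shell as a second command; in the second it stays inside a single argument. That only takes the shell out of the picture: a name starting with `-` could still be read by the program as an option, so this is not a substitute for validating the filename.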
Who knows how elaborate their static analyzer actually is in practice? I wrote a Python analyzer in grad school, and the results were both pretty interesting (type inference tracing through function calls) and pretty mediocre. It was also fiddly-as-hell to get working. Plus... the first rule of statically-analyzing dynamic languages is that the results get weird after the first eval()--anything could happen!
There are more legitimate uses of 'exec', of course, but one still needs to be very careful with it. If there really are Node modules that pass their input to 'exec', that strikes me as very poor design.
Don't forget also: if you're reading uncompiled code from a file, that's an occasion for eval.
So if my backup utility happens to depend on a wrapper library for node's Child Process module (which probably doesn't sanitize its inputs, as that would break quite a bit of functionality), it's considered "vulnerable".
Apparently any module that uses eval or exec counts in "may", regardless of whether there's an actual vulnerability. Then using dependency analysis they extrapolate the 3% of packages that actually use these features to 20% that depend on something that uses them.
Maybe uses of eval or exec should be more closely audited for safety, but blanket stating that any use of core language features makes you vulnerable is just vacuous clickbait.
This is a Synode sales pitch.
Direct quote: "then about 20% of all modules turn out to directly or indirectly depend on at least one injection API"
Where "injection API" is a reference to either the eval() or exec() functions.
> A staggering 90% of the call sites do not use any mitigation technique at all.
> Another 9% attempt to sanitise input using regular expressions. Unfortunately, most of those were not correctly implemented.
I thought it was interesting and relevant for this audience, so I put a more sensational (but arguably accurate - you can argue the semantics of vulnerable all day) headline in the title to grab HN readers' attention. Now it is top of the front page with 40 votes in 25 minutes.
If you want to read papers like this every day, I got it from the morning paper mailing list: https://blog.acolyer.org/
Edit: it's on the front page now.
Sometimes I worry about humanity!
I'm not sure this is one of those times, though.
> Understanding and automatically preventing injection attacks on Node.js
The article further guards the hyperbole with phrasing like "may be vulnerable" which is not reflected in the HN submission title.
As click-baity as the title here could be interpreted, the article is highlighting what could be a significant problem: people often don't perform input validation well or create designs which make it difficult to do well at all.
The article may be understating the issue by concentrating on specific types of attack: it seems to be referring only to calls to exec() and eval(), so it could be underestimating the problem by only considering server-side JS and shell/OS injection vectors. I would think that database vectors and client-side vectors (resulting in possible XSS attacks) are pretty common too, probably more so. They certainly are in a lot of code I've seen both online and in private codebases, due to lax input sanity checking or worse: architectures that make good validation in this respect practically impossible.
This is not at all unique to node.
Many libraries simply assume that the caller is performing input validation and only sending them sanitised commands. This is not safe, particularly in a publicly available library, as a lot of naive programmers will assume that the library is performing some validation and will raise exceptions in the presence of something dangerous. Even documenting the lack of validation won't help in many cases, because how many people read documentation in detail until something goes wrong?!
> Another 9% attempt to sanitise input using regular expressions. Unfortunately, most of those were not correctly implemented.
This is a common problem too, the most problematic issue being people using regular expressions to try to identify bad inputs. Trying to enumerate the bad is generally impossible, as the range of bad inputs is usually infinitely larger than the range of good ones.
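A quick sketch of the difference, in case it helps. Both patterns and both function names are invented for this example (they are not taken from the paper or any real package), and the allowlist shown is just one plausible shape for an uploaded filename.

```javascript
// Denylist (hypothetical): tries to enumerate "bad" characters.
// It catches ; & | but says nothing about $(...), backticks, or ../
const DENY = /[;&|]/;
function passesDenylist(name) {
  return !DENY.test(name);
}

// Allowlist (hypothetical): accepts only a narrow, known-good shape:
// letters, digits and underscores, optionally followed by dot-separated
// parts of the same kind. Anything else is rejected by default.
const ALLOW = /^[A-Za-z0-9_]+(\.[A-Za-z0-9_]+)*$/;
function passesAllowlist(name) {
  return typeof name === 'string' && name.length <= 255 && ALLOW.test(name);
}

const samples = [
  'photo_01.png',
  'a.png; rm -rf ~',
  '$(whoami).png',
  '`id`.png',
  '../../etc/passwd',
];

for (const s of samples) {
  console.log(JSON.stringify(s), {
    denylist: passesDenylist(s),
    allowlist: passesAllowlist(s),
  });
}
```

The denylist lets the `$(whoami)`, backtick, and `../` inputs through because nobody thought to list them, while the allowlist rejects everything except the one plain filename. The allowlist can of course be too strict for some real filenames; that is the trade-off.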
The issue is an important one to bring attention to, but as I leave this comment all the other comments are arguing whether the submitted title is accurate/click-bait.
Here is the video of the conference: https://www.youtube.com/watch?v=xVfLW2JhBq8
IMHO static code analysis for this seems like an utterly useless, dead-end path to improving real-world NodeJS security.
But this is definitely not a security software company trying to spread FUD, as some here are suspecting.