Hacker News new | comments | show | ask | jobs | submit login
[flagged] 20% of all Node.js modules found vulnerable to injection attacks (acolyer.org)
65 points by crispyporkbites 4 months ago | hide | past | web | favorite | 40 comments



This headline is entirely false - 20% of Node modules use 'eval' or 'exec'. While these are certainly less secure and a strong secure coding standard would probably ban them or at least reduce their use, it's entirely possible to use both in safe ways.


Further on:

> 18,924 of all 51,627 call sites are found to be statically safe (36.66%)

> The templates for the vast majority of call sites have at most one hole, and very few templates contain more than five.

If you've got a dependency that calls eval or exec, those aren't great odds that they're doing it safely.


That still requires you to know whether either is being called with user-provided values before you can say it's unsafe.

If, for example, someone had a utility which resized user-uploaded images you couldn't say simply calling exec to run something like ImageMagick was unsafe before checking whether it used the user's filename.


I think that's what they're saying in those quoted lines though. That 36% appeared to be statically-safe (i.e. the string sent to eval or exec was generated using only strings defined in the code) and the others were not.


True — it wasn't clear to me whether e.g. a random temporary filename would be considered static or only actual fixed strings in the code. (Similar questions would arise if e.g. a web app saved things using a database primary key so it's technically not static but the attacker doesn't control it even if they control the file contents)


Ahhhhh nice, I hadn't considered the "random temporary filename" or "db primary key" aspect either. That's something that gives me a small chill, but it's not the end of the world compared to "arbitrary user input piped to exec()"

Who knows how elaborate their static analyzer actually is in practice? I wrote a Python analyzer in grad school, and the results were both pretty interesting (type inference tracing through function calls) and pretty mediocre. It was also fiddly-as-hell to get working. Plus... the first rule of statically-analyzing dynamic languages is that the results get weird after the first eval()--anything could happen!


Yeah, as a developer I see plenty of appeals to dynamic languages but they’re certainly a nightmare for tool authors. I know OpenStack was working on a security scanner for Python – I wonder what percentage of the open issues start with “this is probably very hard but…”


That 20% includes dependencies. I'm sure the majority of those are cli tools (webpack etc), i.e. they don't take unsanitized input from web forms etc.


As someone with almost 40 years of Lisp experience, I firmly believe that the only correct place to call 'eval' is in the implementation of a REPL. For any other purpose that you might be tempted to use it for, there is a better way, that is not only more correct but also faster. Usually, the better way is to pass a function to be called rather than an expression to be evaluated.

There are more legitimate uses of 'exec', of course, but one still needs to be very careful with it. If there really are Node modules that pass their input to 'exec', that strikes me as very poor design.


> the only correct place to call 'eval' is in the implementation of a REPL.

Don't forget also: if you're reading uncompiled code from a file, that's an occasion for eval.


There are a handful of other valid uses, but there are many invalid ones. I recall, a couple of years ago, hearing about a Rails vulnerability caused by some API method calling 'eval' on one of its arguments if it happened to be a string, just as a convenience for some use case. The fact that it did so wasn't even documented. I was left with the impression that most programmers don't understand how to use 'eval', or rather, how and why not to use it.


It's not even that. It's more like "20% of Node modules use _or include a module somewhere in their dependency chain which uses_ 'eval' or 'exec'".

So if my backup utility happens to depend on a wrapper library for node's Child Process module (which probably doesn't sanitize its inputs, as that would break quite a bit of functionality), it's considered "vulnerable".


And why shouldn't it be?


Maybe because it's not sending user input into the "vulnerable" API? Or because it's not even calling the "vulnerable" API in the first place? Or because it's a system utility which will never be exposed to untrusted user input?


Possible, but unlikely. eval is particularly difficult to use correctly, and can almost always be replaced with something better.


Key word is vulnerable. There is a gap in protection and they are at risk to being potentially exploited.


Doesn't mean it isn't click bait. ;-)


> thousands of modules may be vulnerable

(emphasis mine)

Apparently any module that uses eval or exec counts in "may", regardless of whether there's an actual vulnerability. Then using dependency analysis they extrapolate the 3% of packages that actually use these features to 20% that depend on something that use them.

Maybe uses of eval or exec should be more closely audited for safety, but blanket stating that any use of core language features makes you vulnerable is just vacuous clickbait.

This is a Synode sales pitch.


"Vulnerable to injection attacks" seems like a bit of a stretch to me.

Direct quote: "then about 20% of all modules turn out to directly or indirectly depend on at least one injection API"

Where "injection API" is a reference to either the eval() or exec() functions.


To be fair they follow up with

> A staggering 90% of the call sites do not use any mitigation technique at all.

> Another 9% attempt to sanitise input using regular expressions. Unfortunately, most of those were not correctly implemented


This was posted 9 hours ago with the paper title as the name, and got 3 upvotes:

https://news.ycombinator.com/item?id=16566587

I thought it was interesting and relevant for this audience, so I put a more sensational (but arguably accurate - you can argue the semantics of vulnerable all day) headline in the title to grab HN reader's attention. Now it is top of the front page with 40 votes in 25 minutes.

If you want to read papers like this everyday, I got it from the morning paper mailing list: https://blog.acolyer.org/


That's against the site guidelines. Would you please read them and follow them when posting here?

https://news.ycombinator.com/newsguidelines.html


Hi Dan — along with rebuking the poster, I wish you had also fixed the title. There are substantive issues here worth discussing.


Ok, how about we put https://news.ycombinator.com/item?id=16566587 in the second-chance queue (described at https://news.ycombinator.com/item?id=11662380 and links back from there).

Edit: it's on the front page now.


So basically: click-bait / vote-bait works, even with audiences that I expect to have a larger degree of technical literacy overall than the general populace.

Sometimes I worry about humanity!


Upvote is commonly used to mean "read it later" by this audience. It is by no means a vote on quality.


I sometimes upvote even when I disagree, if I think it will lead to quality discussion.

I'm not sure this is one of those times, though.


I can't say as I would think to use an upvote that way. I'd leave the article open in a tab, add it to browser bookmarks, or add it to a list in my preferred notes/reminders app (currently Google's Keep).


This is an incorrect use of absolutes.


Is it? How do you see your upvoted content to read later?


In your profile there's a link to upvoted submissions.


Keep in mind that "9 hours ago" is after midnight/close to midnight for a large portion of the HN user base, so a small number of upvotes is not surprising.


But shouldn't the number of stories posted then be lower, which means it stays on /new for longer and gets more visibility?


More visibility is not quite correct: although it is on /new for longer, the number of people reading /new is much fewer, too.


The title here (20% of all Node.js modules found vunerable to injection attacks) seems more sensationalised then that of the actual article:

> Understanding and automatically preventing injection attacks on Node.js

The article further guards the hyperbole with phrasing like "may be vulnerable" which is not reflected in the HN submission title.

As click-baity as the title here could be interpreted, the article is highlighting what could be a significant problem: people often don't perform input validation well or create designs which make it difficult to do well at all.

The article may be understating the issue by concentrating on specific types of attack: it seems to be referring only to calls to exec() and eval(), so could underestimating the problem by only considering served-side JS and shell/OS injection vectors. I would think that database vectors and client-side vectors (resulting on possible XSS attacks) are pretty common too, probably more so. They certainly are in a lot of code I've seen both online and in private codebases, due to lax input sanity checking or worse: architectures that make good validation in this respect practically impossible.

This is not at all unique to node.

> A staggering 90% of the call sites do not use any mitigation technique at all.

Many libraries simply assume that the caller are performing input validation and only sending them sanitised commands. This is not safe, particularly in a publicly available library, as a lot of naive programmers will assume that the library is performing some validation and will raise exceptions in the presence of something dangerous. Even documenting the lack of validation won't help in many cases because how many people read documentation in detail until something goes wrong?!

> Another 9% attempt to sanitise input using regular expressions. Unfortunately, most of those were not correctly implemented.

This is a common problem too, the most problematic issue being people using regular expressions to try identify bad inputs. Trying to enumerate the bad is generally impossible as the range of bad inputs is usually infinitely larger then the range of good ones.


Title should be changed to "Synode: understanding and automatically preventing injection attacks on Node.js"

The issue is an important one to bring attention to, but as I leave this comment all the other comments are arguing whether the submitted title is accurate/click-bait.


The research paper was accepted at NDSS, so thats something.

Here is the video of the conference: https://www.youtube.com/watch?v=xVfLW2JhBq8

IMHO static code analysis for this seems utterly useless, dead-end path to improve real-world NodeJS security.

But this is definitely not a security software company trying to spread FUD like some suspecting here


"Vulnerable" due to eval() or exec()? Not really an "injection API".


That's click-bait


tl;dr people pushing security tools for nodejs say security tools are the new most important aspect of node servers because some of them use calls that could maybe possibly be used in an unsafe manner but currently don't appear to be.




Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact

Search: